Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

engine: support for system proxy settings #7255

Merged
merged 8 commits into from May 7, 2024
Merged

Conversation

sipsma
Copy link
Contributor

@sipsma sipsma commented May 2, 2024

(related issue: #6599)

This adds support for:

  • Automatically passing standard proxy settings (HTTP_PROXY, HTTPS_PROXY, NO_PROXY, ALL_PROXY, FTP_PROXY) from the engine to all containers.
  • Automatically passing custom GOPROXY settings from the engine container to the Go SDK.
    • Specifically, the engine container must have _DAGGER_ENGINE_SYSTEMENV_GOPROXY set for that value to be passed as GOPROXY to the Go SDK.
    • More details in commit message here: c20f761

There's integ test coverage for everything except:

  • FTP_PROXY - too obscure to be worth it right now
  • ALL_PROXY - seems obscure and interpreted differently by different programs (or not interpreted at all, i.e. by Go stdlib). The other tests cover passing of these env vars well enough anyways.

Commit messages have more thorough descriptions of plumbing+reasoning as needed.

@sipsma sipsma added this to the v0.11.3 milestone May 2, 2024
@sipsma sipsma requested a review from jedevc May 2, 2024 03:35
@sipsma sipsma force-pushed the proxy-support branch 2 times, most recently from 7a35636 to c20f761 Compare May 4, 2024 00:23
@sipsma sipsma marked this pull request as ready for review May 4, 2024 00:38
@sipsma sipsma requested review from jedevc and removed request for jedevc May 4, 2024 00:38
// settings from the engine container. It may be made public in the future with more
// refined design.
dagql.Func("__withSystemEnvVariable", s.withSystemEnvVariable).
Doc(`(Internal-only) Inherit this environment variable from the engine container if set there with a special prefix.`),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more just something to note - the number of "internal-only APIs" has kind of started to grow.

I'd really like to try splitting some of these out, ideally by having a concept of a "privileged" session, which serves those APIs - other calls wouldn't see it. Then codegen wouldn't generate any of these for those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have something more fleshed out here I agree, but this __ prefix does result in them being hidden from codegen (just to be clear, not sure if that's what you were getting at with "Then codegen wouldn't generate any of these for those").

Comment on lines 165 to 178
switch {
case upperSet && lowerSet:
continue
case upperSet:
proc.Meta.Env = append(proc.Meta.Env, lowerProxyEnvName+"="+upperProxyVal)
case lowerSet:
proc.Meta.Env = append(proc.Meta.Env, upperProxyEnvName+"="+lowerProxyVal)
default:
val, ok := os.LookupEnv(upperProxyEnvName)
if ok {
proc.Meta.Env = append(proc.Meta.Env, upperProxyEnvName+"="+val, lowerProxyEnvName+"="+val)
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but I'm not sure I get this (even though this logic is borrowed from past behavior).

It seems like this attempts to set both upper HTTP_PROXY and lower http_proxy in the child container in the first 3 cases (is this standard?). But the default case doesn't have this logic.

We could definitely do with a comment explaining this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is that if we are going to set HTTP_PROXY or one of the other envs, then set it in both upper+lower cases. I believe some programs only interpret one or the other and worst case it's harmless to have both.

In the default case, we are still setting both upper and lower in the container. We just only support reading the upper case one from the actual engine itself, which is fine because users just need to set that one when custom provisioning an engine. I'll add comments for all this though.

@@ -130,6 +133,98 @@ func (w *Worker) Run(
)
}

func (w *Worker) addExtraEnvs(proc *executor.ProcessInfo) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, it's really nice to have this custom worker for things like this 🎉 I'm a fan 😆

sipsma and others added 7 commits May 7, 2024 12:42
This setting meant that containers could not resolve names that were
resolvable in the engine (even if direct IPs were still usable).

While supporting that is incredibly obscure, the need has arisen in
integ tests for proxy support, where the test setup has:
* a proxy service
* a nested engine with dep on the proxy service
* containers running in that nested engine that automatically get proxy
  settings from their engine

In this case, being able to use the service binding alias for squid,
which is set in the engine's /etc/hosts, massively simplifies TLS
testing since the proxy's certs can all reference just that alias.

From poking around, it does not seem that this setting was load-bearing
for us and was just inherited from the original `dnsname` configuration
we picked up (and in turn `dnsname` had just set that from the beginning
with no context).

Looking around online for use cases that call for this setting in
dnsmasq, it seems to generally be obscure/complex networking setups
where entries in /etc/hosts end up pointing to IPs local to a
container/machine but they want nested clients to point to a different
IP.

My imagination is limited, but I cannot currently think of any situation
in which this would apply to engine containers; the networking settings
of the engine itself should dictate the networking settings of the
containers so any aliases found in the engine's /etc/hosts can apply to
resolving names queried by containers.
* To be clear: this does *not* result in containers' /etc/hosts having
  different values from before. It just makes all names resolvable by
  the engine also resolvable by containers, which is consistent with the
  behavior of DNS names outside of aliases.

Signed-off-by: Erik Sipsma <[email protected]>
The previous commit that tweaked dnsmasq conf enables just the squid
alias to be used for the proxy endpoint, which massively simplifies the
rest of the test.

Signed-off-by: Erik Sipsma <[email protected]>
This is implemented in such a way that it's a bit more generic but
hidden from external clients for now.

The overall plumbing is:
1. A hidden (via `__` prefix) API on Container for specifying env vars
   that should be inherited from the engine container, where they must
   be prefixed with `_DAGGER_ENGINE_SYSTEMENV_` (e.g.
   `_DAGGER_ENGINE_SYSTEMENV_GOPROXY`).
2. The Container implementation sets metadata that the Executor needs
   via LLB *metadata*. Importantly, this is very different from the
   previous `ftp_proxy` hack in that this is just metadata and thus
   doesn't impact LLB digests. We specifically use the `Description` LLB
   metadata, which is explicitly doc'd as never being interpreted by the
   solver and thus safe to use for this.
3. The Worker's ResolveOp method registers the solver.Vertex (which
   includes that metadata) so that it can be read by the Executor as
   needed.
4. The Executor uses that metadata to set any requested env vars from
   the system.

Once we've fleshed out the design of support for these system env vars
we can open up the API.

This plumbing of executorMetadata is also intended to be easy to re-use
for any other needs that arise in the future similar to this.

Integ tests are also included here with a pretty nice GOPROXY
implementation usable a library.

Signed-off-by: Erik Sipsma <[email protected]>
Signed-off-by: Justin Chadwell <[email protected]>
@sipsma sipsma merged commit 0152419 into dagger:main May 7, 2024
61 checks passed
shykes pushed a commit to shykes/dagger that referenced this pull request May 7, 2024
* engine: support for system proxy settings

Signed-off-by: Erik Sipsma <[email protected]>

* networking: remove no-hosts from dnsmasq conf

This setting meant that containers could not resolve names that were
resolvable in the engine (even if direct IPs were still usable).

While supporting that is incredibly obscure, the need has arisen in
integ tests for proxy support, where the test setup has:
* a proxy service
* a nested engine with dep on the proxy service
* containers running in that nested engine that automatically get proxy
  settings from their engine

In this case, being able to use the service binding alias for squid,
which is set in the engine's /etc/hosts, massively simplifies TLS
testing since the proxy's certs can all reference just that alias.

From poking around, it does not seem that this setting was load-bearing
for us and was just inherited from the original `dnsname` configuration
we picked up (and in turn `dnsname` had just set that from the beginning
with no context).

Looking around online for use cases that call for this setting in
dnsmasq, it seems to generally be obscure/complex networking setups
where entries in /etc/hosts end up pointing to IPs local to a
container/machine but they want nested clients to point to a different
IP.

My imagination is limited, but I cannot currently think of any situation
in which this would apply to engine containers; the networking settings
of the engine itself should dictate the networking settings of the
containers so any aliases found in the engine's /etc/hosts can apply to
resolving names queried by containers.
* To be clear: this does *not* result in containers' /etc/hosts having
  different values from before. It just makes all names resolvable by
  the engine also resolvable by containers, which is consistent with the
  behavior of DNS names outside of aliases.

Signed-off-by: Erik Sipsma <[email protected]>

* cleanup integ tests

The previous commit that tweaked dnsmasq conf enables just the squid
alias to be used for the proxy endpoint, which massively simplifies the
rest of the test.

Signed-off-by: Erik Sipsma <[email protected]>

* add integ tests that use basic auth for proxy

Signed-off-by: Erik Sipsma <[email protected]>

* add integ tests for NO_PROXY

Signed-off-by: Erik Sipsma <[email protected]>

* add support for GOPROXY setting on the Go SDK

This is implemented in such a way that it's a bit more generic but
hidden from external clients for now.

The overall plumbing is:
1. A hidden (via `__` prefix) API on Container for specifying env vars
   that should be inherited from the engine container, where they must
   be prefixed with `_DAGGER_ENGINE_SYSTEMENV_` (e.g.
   `_DAGGER_ENGINE_SYSTEMENV_GOPROXY`).
2. The Container implementation sets metadata that the Executor needs
   via LLB *metadata*. Importantly, this is very different from the
   previous `ftp_proxy` hack in that this is just metadata and thus
   doesn't impact LLB digests. We specifically use the `Description` LLB
   metadata, which is explicitly doc'd as never being interpreted by the
   solver and thus safe to use for this.
3. The Worker's ResolveOp method registers the solver.Vertex (which
   includes that metadata) so that it can be read by the Executor as
   needed.
4. The Executor uses that metadata to set any requested env vars from
   the system.

Once we've fleshed out the design of support for these system env vars
we can open up the API.

This plumbing of executorMetadata is also intended to be easy to re-use
for any other needs that arise in the future similar to this.

Integ tests are also included here with a pretty nice GOPROXY
implementation usable a library.

Signed-off-by: Erik Sipsma <[email protected]>

* chore: tidy go.mod

Signed-off-by: Justin Chadwell <[email protected]>

* add clarifying comments to proxy env setting code

Signed-off-by: Erik Sipsma <[email protected]>

---------

Signed-off-by: Erik Sipsma <[email protected]>
Signed-off-by: Justin Chadwell <[email protected]>
Co-authored-by: Justin Chadwell <[email protected]>
vikram-dagger pushed a commit to vikram-dagger/dagger that referenced this pull request May 8, 2024
* engine: support for system proxy settings

Signed-off-by: Erik Sipsma <[email protected]>

* networking: remove no-hosts from dnsmasq conf

This setting meant that containers could not resolve names that were
resolvable in the engine (even if direct IPs were still usable).

While supporting that is incredibly obscure, the need has arisen in
integ tests for proxy support, where the test setup has:
* a proxy service
* a nested engine with dep on the proxy service
* containers running in that nested engine that automatically get proxy
  settings from their engine

In this case, being able to use the service binding alias for squid,
which is set in the engine's /etc/hosts, massively simplifies TLS
testing since the proxy's certs can all reference just that alias.

From poking around, it does not seem that this setting was load-bearing
for us and was just inherited from the original `dnsname` configuration
we picked up (and in turn `dnsname` had just set that from the beginning
with no context).

Looking around online for use cases that call for this setting in
dnsmasq, it seems to generally be obscure/complex networking setups
where entries in /etc/hosts end up pointing to IPs local to a
container/machine but they want nested clients to point to a different
IP.

My imagination is limited, but I cannot currently think of any situation
in which this would apply to engine containers; the networking settings
of the engine itself should dictate the networking settings of the
containers so any aliases found in the engine's /etc/hosts can apply to
resolving names queried by containers.
* To be clear: this does *not* result in containers' /etc/hosts having
  different values from before. It just makes all names resolvable by
  the engine also resolvable by containers, which is consistent with the
  behavior of DNS names outside of aliases.

Signed-off-by: Erik Sipsma <[email protected]>

* cleanup integ tests

The previous commit that tweaked dnsmasq conf enables just the squid
alias to be used for the proxy endpoint, which massively simplifies the
rest of the test.

Signed-off-by: Erik Sipsma <[email protected]>

* add integ tests that use basic auth for proxy

Signed-off-by: Erik Sipsma <[email protected]>

* add integ tests for NO_PROXY

Signed-off-by: Erik Sipsma <[email protected]>

* add support for GOPROXY setting on the Go SDK

This is implemented in such a way that it's a bit more generic but
hidden from external clients for now.

The overall plumbing is:
1. A hidden (via `__` prefix) API on Container for specifying env vars
   that should be inherited from the engine container, where they must
   be prefixed with `_DAGGER_ENGINE_SYSTEMENV_` (e.g.
   `_DAGGER_ENGINE_SYSTEMENV_GOPROXY`).
2. The Container implementation sets metadata that the Executor needs
   via LLB *metadata*. Importantly, this is very different from the
   previous `ftp_proxy` hack in that this is just metadata and thus
   doesn't impact LLB digests. We specifically use the `Description` LLB
   metadata, which is explicitly doc'd as never being interpreted by the
   solver and thus safe to use for this.
3. The Worker's ResolveOp method registers the solver.Vertex (which
   includes that metadata) so that it can be read by the Executor as
   needed.
4. The Executor uses that metadata to set any requested env vars from
   the system.

Once we've fleshed out the design of support for these system env vars
we can open up the API.

This plumbing of executorMetadata is also intended to be easy to re-use
for any other needs that arise in the future similar to this.

Integ tests are also included here with a pretty nice GOPROXY
implementation usable a library.

Signed-off-by: Erik Sipsma <[email protected]>

* chore: tidy go.mod

Signed-off-by: Justin Chadwell <[email protected]>

* add clarifying comments to proxy env setting code

Signed-off-by: Erik Sipsma <[email protected]>

---------

Signed-off-by: Erik Sipsma <[email protected]>
Signed-off-by: Justin Chadwell <[email protected]>
Co-authored-by: Justin Chadwell <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants