Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of CA certificates #378

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

translatenix
Copy link
Contributor

Instead of bundling Pkl's built-in CA certificates as a class path resource and loading them at runtime, pass them to the native image compiler as the default SSL context's trust store. This results in faster SSL initialization and is more consistent with how default certificates are handled when running on the JVM.

Further related improvements:

  • Change --ca-certificates and corresponding APIs to accept directories in addition to files. Passing a directory has the same effect as passing each of the directory's regular files.
  • Set the ~/.pkl/cacerts CLI default directly in CliBaseOptions.
  • Remove HttpClientBuilder methods addDefaultCliCertificates and addBuiltInCertificates.
  • Remove pkl-certs subproject and the optional dependencies on it.
  • Move PklCARoots.pem to pkl-cli/src/certs.
  • Rename occurrences of certificateFiles to certificatePaths.
  • Fix certificate related error messages that were missing an argument.
  • Prevent PklBugException if initialization of CliBaseOptions.httpClient fails.

@holzensp holzensp requested a review from bioball April 5, 2024 14:05
@translatenix
Copy link
Contributor Author

Could someone take a look at this? It’s a significant improvement. Once I found https://www.graalvm.org/latest/reference-manual/native-image/dynamic-features/CertificateManagement/#build-time-options, everything fell in place.

@bioball
Copy link
Contributor

bioball commented Apr 17, 2024

I haven't forgotten about this one. Will take a look when I get the chance.

@translatenix translatenix force-pushed the improve-certs branch 2 times, most recently from d7fb715 to c9806d9 Compare April 25, 2024 20:30
Copy link
Contributor

@bioball bioball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, using system properties to bake the certs into the trust store is a good change! Also, I'm down with getting rid of the pkl-certs distribution.

However, I think we should not change how the options works. For example, this is currently an error, and should continue to be an error:

pkl eval --ca-certificates not/a/valid/path.pem

Going by the same logic, if the provided argument is a directory and we scan it to include those certs, I would expect it to replace the set of trusted certs. An empty directory would logically mean: don't trust any certs.

Because of this, I don't think we should say that the default value of --ca-certificates is ~/.pkl/cacerts. Rather, we should keep it the way it is; if there is no --ca-certificates argument, then Pkl will look for it in ~/.pkl/cacerts, and otherwise fall back to its own certs.

Because of this, I don't think accepting directories as --ca-certificates actually provides that much value. With that said, though, I can go either way; am okay with keeping it if there's a good use-case for it.


Setting this option replaces the existing set of CA certificates bundled into the CLI.
Setting this option to anything other than a non-existing or empty directory
replaces the existing set of CA certificates bundled into the CLI.
Copy link
Contributor

@bioball bioball Apr 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is asking for hidden configuration errors. This will execute just fine, even if the path doesn't exist:

pkl eval --ca-certificates not/a/valid/path/

I think it's better to keep the way it works today.

More thoughts left in the review summary.

pkl-cli/src/main/kotlin/org/pkl/cli/Main.kt Outdated Show resolved Hide resolved
@translatenix
Copy link
Contributor Author

To recap the current behavior:

  • pkl eval bird.pkl --ca-certificates not/a/valid/path.pem fails if and when an HTTPS request is made.
  • If --ca-certificates is not specified and ~/.pkl/cacerts does not exist or is empty, the built-in certificates are used.
  • It's impossible to not trust any certificate. I also can't think of a use case for this. To disable HTTPS, --allowed-modules should be used.

The changes proposed in this PR have many upsides:

  1. Nice separation of concerns: pkl-core implements a generic certificate file/directory feature, and pkl-cli implements the ~/.pkl/cacerts default.
  2. Directory scanning, including error handling, happens in the same place and at the same time as certificate file loading. Everything is lazy.
  3. Removing addBuiltinCertificates and addDefaultCertificates reduces the API surface of HttpClientBuilder.
  4. The behavior and default of --ca-certificates/certificatePaths is easy to explain and document.

Given that a non-existing ~/.pkl/cacerts directory must be tolerated, I decided to keep things simple and also tolerate non-existing files. If you feel strongly that this isn't good enough, I can think of three solutions:

  1. Use separate options for certificate files and directories. Then non-existing files can be distinguished from non-existing directories and can be treated differently.
  2. Change HttpClient to fail if a certificate path (file or directory) does not exist. Change pkl-cli to only set the ~/.pkl/cacerts default if the directory exists.
  3. Move directory scanning to pkl-cli and make it eager.

I think I like (2) relatively best because it keeps most of the listed upsides. What do you think?

One more question: Are we certain that certificateUris is still needed? If not, I'd like to remove it until there is a proven need.

@bioball
Copy link
Contributor

bioball commented Apr 26, 2024

In 0.25 (the current release), pkl eval will error immediately if the given --ca-certificates files are invalid. I missed this--in the current dev version, it will only fail when the first HTTPS request is made. This feels like a behavior regression to me. In that case, perhaps we shouldn't initialize the HttpClient lazily? I think this is your option 3?

I'll do some benchmarking here; I'm curious how much time we're saving by trying to make things lazy.

One more question: Are we certain that certificateUris is still needed? If not, I'd like to remove it until there is a proven need.

I don't think this is needed anymore, now that we don't use classpath resources for embedding the cert files.

@translatenix
Copy link
Contributor Author

translatenix commented Apr 26, 2024

We discussed before that being lazy means that SSL initialization errors are deferred. The conclusion was that being lazy is in the Pkl spirit. Lazy SSL initialization is why 0.26 is much snappier than 0.25. The JVM also initializes SSL lazily. It’s known to be expensive.

@translatenix
Copy link
Contributor Author

I don't think this is needed anymore, now that we don't use classpath resources for embedding the cert files.

Removed.

@bioball
Copy link
Contributor

bioball commented Apr 26, 2024

You're right; initializing cert stuff takes quite a while. In particular, CertificateFactory#generateCertificates() takes ~160ms to initialize when passed a pretty standard cert bundle. I think that's expensive enough to justify "let's defer spawning until it's actually needed". In that case, option 2 indeed sounds like the best approach. I also think that unless there's a good use-case for accepting a directory in our CLI flag, let's go back to accepting just a cert bundle.

Also: I'm kind of shocked that it takes so along to initialize. I'm curious if there's some better approach here.

Another observation: SSLContext.getDefault() takes ~5ms to return on native, and about 100ms to return in jpkl.

@translatenix
Copy link
Contributor Author

I also think that unless there's a good use-case for accepting a directory in our CLI flag, let's go back to accepting just a cert bundle

I think that accepting a directory makes sense both from a user perspective (can use a directory other than ~/.pkl/cacerts) and from an internal perspective (see "upsides"). For comparison, cURL supports both files (--cacert) and directories (--capath).

Also: I'm kind of shocked that it takes so along to initialize. I'm curious if there's some better approach here.

Default SSL context with baked-in certificates and lazy SSL initialization is already a significant improvement compared to 0.25.

Perhaps SSL initialization time for user-provided certificates could be further improved by accepting a .p12 file instead of .pem files and changing the javax.net.ssl.trustStore property at runtime instead of creating our own SSL context. But this also has downsides.

https://www.graalvm.org/latest/reference-manual/native-image/dynamic-features/CertificateManagement/#runtime-options

@bioball
Copy link
Contributor

bioball commented Apr 29, 2024

I think that accepting a directory makes sense both from a user perspective (can use a directory other than ~/.pkl/cacerts) and from an internal perspective (see "upsides"). For comparison, cURL supports both files (--cacert) and directories (--capath).

Hmm... yeah, I don't feel strongly about this. Sounds fine to me to accept directories too, especially since both cURL and wget do. But I think let's follow their lead in this case, and make it a separate flag (I think maybe call it --ca-directory), which would also guard against unintentional configuration errors.

@translatenix
Copy link
Contributor Author

But I think let's follow their lead in this case, and make it a separate flag (I think maybe call it --ca-directory), which would also guard against unintentional configuration errors.

Which config errors are you trying to guard against? I feel a single option is more elegant and keeps the CLI/API smaller. Also --ca-certificates works well for both files and directories.

Instead of bundling Pkl's built-in CA certificates as a class path resource and loading them at runtime,
pass them to the native image compiler as the default SSL context's trust store.
This results in faster SSL initialization and is more consistent with how default certificates
are handled when running on the JVM.

Further related improvements:
- Change `--ca-certificates` and corresponding APIs to accept directories
  in addition to files. Passing a directory has the same effect as passing
  each of the directory's regular files.
- Set the `~/.pkl/cacerts` CLI default directly in `CliBaseOptions`.
- Remove HttpClientBuilder methods `addDefaultCliCertificates` and `addBuiltInCertificates`.
- Remove pkl-certs subproject and the optional dependencies on it.
- Move `PklCARoots.pem` to `pkl-cli/src/certs`.
- Rename occurrences of `certificateFiles` to `certificatePaths`.
- Fix certificate related error messages that were missing an argument.
- Prevent PklBugException if initialization of `CliBaseOptions.httpClient` fails.
Rationale: There is no longer a clear need for this feature.
@bioball
Copy link
Contributor

bioball commented Apr 30, 2024

The error would be accidentally setting the flag to a directory, when it was supposed to be a file, or vice versa.

I also like that:

  • This aligns with how curl and wget work
  • This is simply a new feature, rather than a change in behavior of an existing flag.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants