Cache extracted layers #3850
You are talking about pretty much this line here.

Is it worth it? Sure, subsequent builds go from 26s to 9s, but the initial build still has to convert to layers and create the tar, which then gets cached. The trade-off is the complexity. We would need to implement it on our own, as ggcr, the cache mechanism we are using, does not have support for caching that as far as I know.

How would you indicate what the cache entry actually is? Let's say that we create the flattened image tar for amd64. That tar file has a sha256sum, so we could store it in the OCI layout under its content address (`blobs/sha256/<digest>`). But what would we use to reference that blob? For example, when I currently look an image up, its entry in `index.json` looks like:

```json
{
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "size": 1785,
  "digest": "sha256:32908513b9ad552eeab6720a69723b01d5e2a46fc67bd821b277fef3d6272de0",
  "annotations": {
    "org.opencontainers.image.ref.name": "linuxkit/init:97b398b5deab3fc62531fae833085c19d9f92a67"
  }
}
```

I can see the lookup process for the image itself. What would the process be for going from an image reference to its flattened tar?

One other thing to keep in mind: the linuxkit cache is expected to last another 12-18 months or so. Docker image cache is finally moving to containerd under the covers, which means support for multi-arch indexes and images stored in their native OCI format (not just expanded layers). It is experimental, so I expect another 6-9 months for them to have all the features we need, and another equal amount for sufficient adoption that we can drop the linuxkit cache entirely (which will make me very happy).

If we can find a way to do this sanely, then by all means.
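To make the lookup question concrete, here is a throwaway sketch of the first hop (ref name → digest → blob path) against a mock OCI layout built on the spot. This is not linuxkit's implementation; it assumes `jq` is available and just follows the OCI image-layout convention of `index.json` plus `blobs/sha256/`:

```shell
# Build a minimal mock OCI layout in a temp dir.
layout=$(mktemp -d)
mkdir -p "$layout/blobs/sha256"

digest="32908513b9ad552eeab6720a69723b01d5e2a46fc67bd821b277fef3d6272de0"
ref="linuxkit/init:97b398b5deab3fc62531fae833085c19d9f92a67"

# Stand-in blob for the image index.
echo '{"mediaType":"application/vnd.oci.image.index.v1+json"}' \
  > "$layout/blobs/sha256/$digest"

# index.json containing the entry quoted above.
cat > "$layout/index.json" <<EOF
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.index.v1+json",
      "digest": "sha256:$digest",
      "annotations": {
        "org.opencontainers.image.ref.name": "$ref"
      }
    }
  ]
}
EOF

# Resolve the ref name to its digest, then map the digest to a blob path.
resolved=$(jq -r --arg ref "$ref" \
  '.manifests[] | select(.annotations["org.opencontainers.image.ref.name"] == $ref) | .digest' \
  "$layout/index.json")
blob="$layout/blobs/$(echo "$resolved" | tr ':' '/')"
echo "$resolved"
test -f "$blob" && echo "blob found"
```

The open question in the thread is exactly what the second hop looks like: once you have this digest, how you get from it to the flattened tar.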
@deitch thanks for all the feedback. First, I should be more clear about why I'd like to do this. On Docker Desktop, we have pretty heavy images that contain a lot of packages. When we rebuild the application after a change, we have a lot of caching to avoid doing the same thing twice. However, as soon as we have changed the code of a single package, one of the images has to be rebuilt, and we'd like this to be as quick as possible. With the change I'd like to see, the images of all the packages, except the single one that needs to be rebuilt, are already in cache, whether it's the on-disk cache of linuxkit or a docker daemon. However, each of those images is present in the form of layers that we need to merge again and again, on each build. I'd like to cache that result, per image. On my machine, this accelerates a no-op build from 16s to 9s, and that's pretty important because 99% of my builds will be able to leverage that cache.
I pushed some demo code, flawed in many ways, here.
I get the purpose of it. You are saying that it is a frequent usage; let's work with that. The design you propose in #3851 says: when I need an image, first check for a cached flattened copy of it. It becomes a branch, rather than a simple "read image as tar stream of flattened filesystem with layers applied."
Yes. All I added is the step that saves the expensive computation of flattening the image.

My goal was something like this: from the image name, I get an image ID or digest. This ID/digest becomes the key to the blob.
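The "digest as key" idea above can be sketched as follows. The layout and file names here are made up for illustration (only the digest is taken from the `index.json` example earlier in the thread); the point is just that a later build can look the flattened tar up by the image's digest without re-flattening:

```shell
cache=$(mktemp -d)
mkdir -p "$cache/flattened"

# Digest from the index.json example in this thread.
image_digest="sha256:32908513b9ad552eeab6720a69723b01d5e2a46fc67bd821b277fef3d6272de0"
key=$(echo "$image_digest" | tr ':' '-')

# First build: flatten (stand-in for the expensive merge) and store under the key.
printf 'flattened rootfs tar bytes' > "$cache/flattened/$key.tar"

# Later build: same image name -> same digest -> same key -> cache hit.
if [ -f "$cache/flattened/$key.tar" ]; then
  echo "cache hit for $image_digest"
else
  echo "cache miss, need to flatten"
fi
```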
OK, I think I see where you are going. It does make sense. I would change the storage approach, though: I definitely would store them in the linuxkit cache. Let's continue our example from above, working with `linuxkit/init`.

Since the only customer for this OCI layout is linuxkit, we can do this without worrying about compatibility with other clients. At first blush, I would think about having two entries per image in `index.json`:

```json
// this is the existing one and points to the OCI image index
{
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "size": 1785,
  "digest": "sha256:32908513b9ad552eeab6720a69723b01d5e2a46fc67bd821b277fef3d6272de0",
  "annotations": {
    "org.opencontainers.image.ref.name": "linuxkit/init:97b398b5deab3fc62531fae833085c19d9f92a67"
  }
},
// this is the new one we add and points to the "index of flattened tars"
{
  "mediaType": "application/vnd.oci.image.index.flattened.v1+json",
  "size": 1785,
  "digest": "sha256:e24fb1cd1bd9346ca452853611e3e1688dc7ec3197fa91b4ef2726282c803028",
  "annotations": {
    "org.opencontainers.image.ref.name": "linuxkit/init:97b398b5deab3fc62531fae833085c19d9f92a67"
  }
}
```

And the "index of flattened tars" looks like:

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.flattened.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.flattened.filesystem.tar",
      "size": 20509696,
      "digest": "sha256:794def3d3d731c5c5d1831ab362e2aa53556b0bcf8e1db1e5ff69d3a96ef84a6",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    }
  ]
}
```
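Given that structure, the second hop of the lookup could look like the sketch below: select the entry for the build platform from the "index of flattened tars" and read its digest, which then addresses the flattened tar in `blobs/sha256/`. This assumes `jq`; the `flattened` media types are the hypothetical ones proposed in this thread, not registered OCI types:

```shell
idx=$(mktemp)
cat > "$idx" <<'EOF'
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.index.flattened.v1+json",
  "manifests": [
    {
      "mediaType": "application/vnd.oci.flattened.filesystem.tar",
      "size": 20509696,
      "digest": "sha256:794def3d3d731c5c5d1831ab362e2aa53556b0bcf8e1db1e5ff69d3a96ef84a6",
      "platform": { "architecture": "amd64", "os": "linux" }
    }
  ]
}
EOF

# Pick the flattened tar for the build platform.
flat_digest=$(jq -r '.manifests[]
  | select(.platform.architecture == "amd64" and .platform.os == "linux")
  | .digest' "$idx")
echo "$flat_digest"
```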
When a linuxkit image is built, the longest part is merging the image layers into single tarballs. The same cost shows up when we `docker run` the image and `docker export` the resulting container.

In both cases, we could cache the result locally. It would basically be a digest -> single tarball cache.
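A toy version of that digest -> single-tarball cache, just to pin down the semantics: the expensive step runs only on a miss. Here `flatten()` is a stand-in for the real work (merging layers, or `docker export`ing a container), and the digests are placeholders:

```shell
cachedir=$(mktemp -d)
builds=0

flatten() {
  # Stand-in for the expensive layer-merge / docker-export step.
  builds=$((builds + 1))
  printf 'single flattened tarball' > "$1"
}

get_flat_tar() {
  out="$cachedir/$1.tar"
  # Only flatten on a cache miss.
  [ -f "$out" ] || flatten "$out"
}

get_flat_tar sha256-aaaa   # miss: flattens
get_flat_tar sha256-aaaa   # hit: reuses the cached tarball
get_flat_tar sha256-bbbb   # different digest: flattens again

echo "flatten ran $builds times"
```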
I've prototyped this and my build goes from 26s to 9s.
wdyt?