cekit-3.3.1 creates more layers than 3.2.0 and fails docker build locally #593
Note: with 3.3.1 the total number of steps reported by CEKit is 213; with 3.2.0 it is 153. The failure:
Last bit of info:
The 3.2.0 ones are:
But 3.3.1 has:
Correct, though this should actually be looked at as a feature :) Let me explain why. Please take a look at #544. That request (a valid one) was to start reusing cached layers when building the image. Previously we added all modules and artifacts at the beginning, and these were processed by the modules later. This meant that if you made literally any change to any module, the whole image would be rebuilt from scratch; depending on the image size and number of modules, that could take a lot of time. But there is an already built-in solution for this: layer caching. So the implementation was changed from "module code and artifacts are added up front, apart from the module processing itself" to "modules are processed as a unit". This means that module code is added when the module is processed, and the same goes for artifacts defined in the module. I don't think there is a solution for this besides trying to optimize the modules. I have a feeling that your modules are too fine-grained, which leads to this high number of layers. I've investigated the Dockerfiles you attached (thanks!) and found that they create the following filesystem layers (we're interested only in the instructions that create filesystem layers):
We need to add a few more layers because of the base image; in both cases that should be 2, which gives 65 for CEKit 3.2 and 125 for CEKit 3.3. Unfortunately I cannot check whether these numbers are correct, but I think they are. In either case, the stack trace on CEKit 3.3 shows the build failing after the last step that created a filesystem layer:
This step created a layer (successfully) and then the build fails when it tries to run the next one. Now, I investigated the Docker source code (the version shipped with RHEL 7.6); here is the tree: https://github.com/moby/moby/tree/7f2769b9e0572f62730d91e79e674efd59b7e234. If we look at https://github.com/moby/moby/blob/7f2769b9e0572f62730d91e79e674efd59b7e234/layer/layer_store.go#L26 we will find that the layer limit is set to 125... which is exactly the number of layers you have in your image! Here is where it fails: https://github.com/moby/moby/blob/7f2769b9e0572f62730d91e79e674efd59b7e234/layer/layer_store.go#L266-L269. This means that if you had one less layer, your image would build properly...
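Both halves of that diagnosis are easy to double-check locally. A minimal sketch in shell, assuming GNU grep and curl are available and that the Dockerfiles attached below sit in the current directory under those names:

```sh
# 1. Count the instructions that create filesystem layers in each Dockerfile.
for f in Dockerfile-3.2.0.txt Dockerfile-3.3.1.txt; do
  printf '%s: ' "$f"
  grep -cE '^(FROM|RUN|COPY|ADD) ' "$f"
done

# 2. Confirm the hardcoded limit in the pinned moby commit referenced above.
curl -s https://raw.githubusercontent.com/moby/moby/7f2769b9e0572f62730d91e79e674efd59b7e234/layer/layer_store.go \
  | grep -n maxLayerDepth
```

The second grep should surface the hardcoded maxLayerDepth = 125 constant; once a layer chain reaches that depth, the layer store refuses to register new layers and the build aborts.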
@goldmann Thanks for the details, that makes sense, and I definitely like the Docker cache reuse. We'll have to take a look at the module structure as it is now and see where we can trim some things down -- I certainly don't want to be constantly 1 under the limit :)
I agree, cache reuse is really good.
I guess the other question is why we actually have such a limit in Docker in the first place. I'm pretty sure that most (all?) storage drivers can handle more easily.
See: #613. After moving modules around somewhat and trying to create some headroom, I took another look at the Dockerfile. The inflation is mostly due to some modules we use that don't actually have content that needs copying (they mostly just set env vars). These still end up having a COPY added for them, creating a Docker layer. With the change above in the template, we avoid 12 of these, speeding up the build and the squash too :) An example of such a module is here: https://github.com/jboss-openshift/cct_module/blob/master/jboss/container/java/jvm/api/module.yaml With the change in place, 12 COPYs of modules that contain only module.yaml and perhaps a README are skipped. The one remaining concern I have with this is whether there is a legitimate use case for a module that contains an artifact but no execution. Since the default action is to copy these into /tmp/artifacts/, I can't see there being much use in this, but I'd like others' thoughts.
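A rough way to spot such content-less modules in a checkout is sketched below; the modules/ layout and file names are assumptions, not necessarily the actual cct_module structure:

```sh
# List module directories that contain nothing but module.yaml (and maybe a
# README); their generated COPY instructions add a layer with no real content.
find modules -name module.yaml | while read -r yaml; do
  dir=$(dirname "$yaml")
  extra=$(find "$dir" -type f ! -name module.yaml ! -iname 'readme*' | wc -l)
  [ "$extra" -eq 0 ] && echo "$dir"
done
```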
Diff of before / after here:
I've verified all of the affected modules in the jboss-eap-cd image and they do not contain additional content; they simply define env vars or install other dependent modules.
Your diff is adding only commented lines, is this correct?
@spolti Yeah, ignore that + stuff at the end; that's from something else I was experimenting with in the image's image.yaml. The relevant differences are the - COPY lines.
I'll just take that part out; it's not relevant to this change at all.
Yeah, as you stated, those COPY instructions only copy module.yaml files. I didn't understand what you said here:
@spolti I was asking if there'd ever be a case where you'd have a module copy a file into /tmp/artifacts but not do anything with it, maybe expecting a later module to find it there and actually do something. I don't think this is a legitimate use case, and I don't think we use it anywhere, but I thought I'd point it out anyway.
Note: I'll regenerate the diff against an image with an unmodified image.yaml, I think; there are some other unrelated changes in that diff too...
Diff from using an unmodified version of https://github.com/jboss-container-images/jboss-eap-7-openshift-image/blob/eap72-dev/image.yaml
This is a good thing and I agree with it. I can't recall a use case for that either, and it seems it saves a few layers :)
Just mentioning that I hit the same issue with CEKit 3.6.0 when building the JBoss EAP 7 image locally.
@ochaloup FYI, this is really an issue with Docker and not CEKit; the Docker layer limit is a bit low IMO.
This can help to overcome it, but as mentioned in one of the comments there, the change might work on one machine and not on others. Or maybe not, since maxLayerDepth is hardcoded.
Dockerfiles attached below. cekit-3.3.1 exceeds the Docker max limit, which looks to be around 224 or so; that is about 50% more than the corresponding build with 3.2.0, where in the last build I counted about 160 or so layers before squash.
I haven't looked at doing any optimization of the image yet, as we really just needed to get them built, but we can take a look at that.
Image.yaml: https://github.com/jboss-container-images/jboss-eap-7-openshift-image/blob/eap73-openjdk11-dev/image.yaml
Dockerfiles:
Dockerfile-3.3.1.txt
Dockerfile-3.2.0.txt
FYI @goldmann @spolti
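For anyone comparing step counts to actual layer counts on a local build, a small sketch (the image tag is a placeholder; substitute whatever your build produced, and remember that metadata-only steps such as ENV or LABEL do not create filesystem layers):

```sh
# Layers the final image carries, including those inherited from the base image:
docker inspect -f '{{len .RootFS.Layers}}' <your-image:tag>

# Per-step view; steps that create no filesystem layer report a size of 0B:
docker history <your-image:tag>
```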