cekit-3.3.1 creates more layers than 3.2.0 and fails docker build locally #593

Open
luck3y opened this issue Aug 20, 2019 · 20 comments

@luck3y
Contributor

luck3y commented Aug 20, 2019

Dockerfiles are attached below. The cekit-3.3.1 build exceeds the Docker max layer limit, hitting it at around 224 or so, which is about 50% more than the corresponding 3.2.0 build, where I counted about 160 layers before squash in the last build.

I haven't looked at doing any optimization of the image yet, as we really just needed to get them built, but we can take a look at that.

Image.yaml: https://github.com/jboss-container-images/jboss-eap-7-openshift-image/blob/eap73-openjdk11-dev/image.yaml

Dockerfiles:

Dockerfile-3.3.1.txt
Dockerfile-3.2.0.txt

FYI @goldmann @spolti

@luck3y luck3y added status/review Sheduled for a review type/bug labels Aug 20, 2019
@luck3y
Contributor Author

luck3y commented Aug 20, 2019

Note with 3.3.1 the total number of steps reported by cekit is 213. With 3.2.0 this is 153.

The failure:

2019-08-20 17:44:37,225 cekit        INFO     Docker: Step 211/213 : USER 185
2019-08-20 17:44:37,412 cekit        INFO     Docker: ---> Running in cd8329cd7838
2019-08-20 17:44:37,467 cekit        ERROR    You can look inside the failed image by running 'docker run --rm -ti 119b9e61215f bash'
2019-08-20 17:44:37,467 cekit        ERROR    ('Image build failed, see logs above.', CekitError("Image build failed: 'max depth exceeded'"))
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/cekit/builders/docker_builder.py", line 89, in _build_with_docker
    raise CekitError("Image build failed: '{}'".format(error_message))
cekit.errors.CekitError: Image build failed: 'max depth exceeded'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/cekit/cli.py", line 334, in run
    command.execute()
  File "/usr/lib/python3.7/site-packages/cekit/builder.py", line 66, in execute
    self.run()
  File "/usr/lib/python3.7/site-packages/cekit/builders/docker_builder.py", line 242, in run
    image_id = self._build_with_docker(docker_client)
  File "/usr/lib/python3.7/site-packages/cekit/builders/docker_builder.py", line 150, in _build_with_docker
    raise CekitError(msg, ex)
cekit.errors.CekitError: ('Image build failed, see logs above.', CekitError("Image build failed: 'max depth exceeded'"))

@luck3y
Contributor Author

luck3y commented Aug 20, 2019

Last bit of info:

$ grep 'COPY\|ADD' Dockerfile-3.2.0.txt | wc -l
3

$ grep 'COPY\|ADD' Dockerfile-3.3.1.txt  | wc -l
63

The 3.2.0 ones are:

COPY repos/content_sets_odcs.repo \
COPY modules /tmp/scripts/
COPY \

But 3.3.1 has:

COPY repos/content_sets_odcs.repo \
COPY modules/jboss.container.user /tmp/scripts/jboss.container.user
COPY modules/jboss.container.openjdk.jdk /tmp/scripts/jboss.container.openjdk.jdk
COPY modules/jboss.container.maven.35.bash /tmp/scripts/jboss.container.maven.35.bash
COPY \
COPY modules/eap-7.3.0.Beta /tmp/scripts/eap-7.3.0.Beta
COPY modules/eap-install-cleanup /tmp/scripts/eap-install-cleanup
COPY modules/eap-7.3-latest /tmp/scripts/eap-7.3-latest
COPY modules/jboss.container.java.jvm.api /tmp/scripts/jboss.container.java.jvm.api
COPY modules/jboss.container.proxy.api /tmp/scripts/jboss.container.proxy.api
COPY modules/jboss.container.java.proxy.bash /tmp/scripts/jboss.container.java.proxy.bash
COPY modules/jboss.container.java.jvm.bash /tmp/scripts/jboss.container.java.jvm.bash
COPY modules/os-java-run /tmp/scripts/os-java-run
COPY modules/dynamic-resources /tmp/scripts/dynamic-resources
COPY modules/jboss.container.s2i.core.api /tmp/scripts/jboss.container.s2i.core.api
COPY modules/jboss.container.maven.s2i.api /tmp/scripts/jboss.container.maven.s2i.api
COPY modules/jboss.container.s2i.core.bash /tmp/scripts/jboss.container.s2i.core.bash
COPY modules/jboss.container.maven.api /tmp/scripts/jboss.container.maven.api
COPY modules/jboss.container.util.logging.bash /tmp/scripts/jboss.container.util.logging.bash
COPY modules/jboss.container.maven.default.bash /tmp/scripts/jboss.container.maven.default.bash
COPY modules/jboss.container.maven.s2i.bash /tmp/scripts/jboss.container.maven.s2i.bash
COPY modules/jboss.container.eap.s2i.bash /tmp/scripts/jboss.container.eap.s2i.bash
COPY modules/jboss.container.jolokia.api /tmp/scripts/jboss.container.jolokia.api
COPY \
COPY modules/jboss.container.jolokia.bash /tmp/scripts/jboss.container.jolokia.bash
COPY modules/os-java-jolokia /tmp/scripts/os-java-jolokia
COPY modules/jboss.eap.cd.jolokia /tmp/scripts/jboss.eap.cd.jolokia
COPY modules/os-eap7-openshift /tmp/scripts/os-eap7-openshift
COPY modules/jboss.eap.config.openshift /tmp/scripts/jboss.eap.config.openshift
COPY \
COPY modules/jboss.eap.cd.openshift.modules /tmp/scripts/jboss.eap.cd.openshift.modules
COPY \
COPY modules/os-eap7-ping /tmp/scripts/os-eap7-ping
COPY \
COPY modules/os-eap-activemq-rar /tmp/scripts/os-eap-activemq-rar
COPY modules/os-eap-launch /tmp/scripts/os-eap-launch
COPY modules/os-eap-node-name /tmp/scripts/os-eap-node-name
COPY modules/os-eap-migration /tmp/scripts/os-eap-migration
COPY modules/os-eap7-launch /tmp/scripts/os-eap7-launch
COPY modules/jboss.eap.cd.openshift.launch /tmp/scripts/jboss.eap.cd.openshift.launch
COPY modules/os-eap-datasource /tmp/scripts/os-eap-datasource
COPY modules/jboss.eap.cd.logging /tmp/scripts/jboss.eap.cd.logging
COPY modules/jboss.eap.config.mp-config /tmp/scripts/jboss.eap.config.mp-config
COPY modules/jboss.eap.config.jgroups /tmp/scripts/jboss.eap.config.jgroups
COPY modules/jboss.eap.config.elytron /tmp/scripts/jboss.eap.config.elytron
COPY modules/jboss.eap.config.tracing /tmp/scripts/jboss.eap.config.tracing
COPY modules/os-eap-probes /tmp/scripts/os-eap-probes
COPY modules/os-eap-sso /tmp/scripts/os-eap-sso
COPY \
COPY modules/os-eap70-sso /tmp/scripts/os-eap70-sso
COPY modules/os-eap-deployment-scanner /tmp/scripts/os-eap-deployment-scanner
COPY modules/os-eap-extensions /tmp/scripts/os-eap-extensions
COPY modules/openshift-layer /tmp/scripts/openshift-layer
COPY modules/openshift-passwd /tmp/scripts/openshift-passwd
COPY modules/os-logging /tmp/scripts/os-logging
COPY modules/jboss.container.prometheus.api /tmp/scripts/jboss.container.prometheus.api
COPY \
COPY modules/jboss.container.prometheus.bash /tmp/scripts/jboss.container.prometheus.bash
COPY modules/jboss.container.eap.prometheus.agent /tmp/scripts/jboss.container.eap.prometheus.agent
COPY modules/jboss.container.eap.prometheus.config /tmp/scripts/jboss.container.eap.prometheus.config
COPY modules/os-eap-txnrecovery.bash /tmp/scripts/os-eap-txnrecovery.bash
COPY modules/os-eap-txnrecovery.run /tmp/scripts/os-eap-txnrecovery.run
COPY modules/os-eap-python /tmp/scripts/os-eap-python

@goldmann
Contributor

Correct, this should be actually looked at as a feature :) Let me explain why.

Please take a look at: #544.

This request (a valid one) was to start reusing cached layers when building the image. Previously we added all modules and artifacts at the beginning, and they were processed by the modules later. This meant that if you made literally any change to any module, the whole image would be rebuilt from scratch. Depending on the image size and the number of modules, that could take a lot of time. But there is already a built-in solution for this: layer caching.

So the implementation was changed from "modules are processed as a unit, except for artifacts and the module code itself" to "modules are processed as a unit". This means that module code is added when the module is processed, and the same goes for artifacts defined in the module.
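
To illustrate why the ordering matters for caching, here is a minimal sketch of a simplified cache model: the first step that reads a changed input misses the cache, and so does every step after it. The modules `a`, `b`, `c` and their `install.sh` scripts are hypothetical, and the model ignores everything else Docker looks at when deciding on a cache hit.

```python
from typing import List, Set, Tuple

# (instruction text, names of the build-context inputs the step reads)
Step = Tuple[str, Set[str]]


def steps_to_rebuild(steps: List[Step], changed: Set[str]) -> List[str]:
    """Return the instructions that cannot be served from cache.

    Simplified model: the first step that reads a changed input misses
    the cache, and so does every step that follows it.
    """
    for index, (_, inputs) in enumerate(steps):
        if inputs & changed:
            return [instruction for instruction, _ in steps[index:]]
    return []


# Old layout: every module is copied up front (hypothetical modules a, b, c).
all_up_front: List[Step] = [
    ("COPY modules /tmp/scripts/", {"a", "b", "c"}),
    ("RUN /tmp/scripts/a/install.sh", set()),
    ("RUN /tmp/scripts/b/install.sh", set()),
    ("RUN /tmp/scripts/c/install.sh", set()),
]

# New layout: each module is copied right before it is processed.
per_module: List[Step] = [
    ("COPY modules/a /tmp/scripts/a", {"a"}),
    ("RUN /tmp/scripts/a/install.sh", set()),
    ("COPY modules/b /tmp/scripts/b", {"b"}),
    ("RUN /tmp/scripts/b/install.sh", set()),
    ("COPY modules/c /tmp/scripts/c", {"c"}),
    ("RUN /tmp/scripts/c/install.sh", set()),
]

# Only module "c" changed.
print(len(steps_to_rebuild(all_up_front, {"c"})))  # 4: everything is rebuilt
print(len(steps_to_rebuild(per_module, {"c"})))    # 2: only the steps for "c"
```

The per-module layout rebuilds less, but it needs one extra COPY per module, which is where the additional layers come from.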

I don't think there is a solution for it besides trying to optimize modules. I have a feeling that your modules are too fine-grained. This leads to this high number of layers.

I looked into the Dockerfiles you attached (thanks!) and found that they create the following numbers of filesystem layers (we're interested only in instructions that create filesystem layers: ^COPY|^RUN|^ADD):

  • CEKit 3.2: 63 (153 steps overall)
  • CEKit 3.3: 123 (213 steps overall)

We need to add a few more layers for the base image. In both cases it should be 2, which gives 65 for CEKit 3.2 and 125 for CEKit 3.3.
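
The count above can be reproduced with a short script. This is a rough sketch, not how CEKit or Docker count layers: it assumes that only COPY, ADD and RUN create filesystem layers, allows leading whitespace before the instruction, and skips continuation lines. The sample Dockerfile and the names `count_layer_instructions` and `BASE_IMAGE_LAYERS` are made up for illustration; the value 2 is the figure quoted above, not something measured here.

```python
import re

# Instructions assumed to create a filesystem layer.
LAYER_INSTRUCTION = re.compile(r"^(COPY|ADD|RUN)\b", re.IGNORECASE)

# Base image layers, as quoted in the comment above (not measured here).
BASE_IMAGE_LAYERS = 2


def count_layer_instructions(dockerfile_text: str) -> int:
    """Count COPY/ADD/RUN instructions, ignoring comments and continuation lines."""
    count = 0
    in_continuation = False
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            # Blank lines and comments neither start nor end an instruction.
            continue
        if not in_continuation and LAYER_INSTRUCTION.match(line):
            count += 1
        in_continuation = line.endswith("\\")
    return count


sample = """\
FROM example/base:latest
# Copy module content
COPY modules/a /tmp/scripts/a
RUN yum install -y foo \\
    && yum clean all
USER 185
ADD artifact.zip /tmp/artifacts/
"""

print(count_layer_instructions(sample))  # 3

# Totals from the counts reported in this thread.
print(63 + BASE_IMAGE_LAYERS)   # 65 for CEKit 3.2
print(123 + BASE_IMAGE_LAYERS)  # 125 for CEKit 3.3
```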

Unfortunately I cannot check whether these numbers are correct, but I think they are. In either case, the build on CEKit 3.3 fails right after the last step that created a filesystem layer:

# Remove custom repo files
RUN rm /etc/yum.repos.d/content_sets_odcs.repo

This created a layer (successfully) and then, when it tries to run USER 185, it fails.

Now, I investigated the Docker source code (that is shipped with RHEL 7.6) and here is the tree: https://github.com/moby/moby/tree/7f2769b9e0572f62730d91e79e674efd59b7e234.

If we look at https://github.com/moby/moby/blob/7f2769b9e0572f62730d91e79e674efd59b7e234/layer/layer_store.go#L26, we will find that the layer limit is set to 125... which is exactly the number of layers you have in your image! Here is where it is failing: https://github.com/moby/moby/blob/7f2769b9e0572f62730d91e79e674efd59b7e234/layer/layer_store.go#L266-L269

This means that if you had one less layer, your image would build properly...
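
The behaviour described above can be modelled in a few lines. This is only a sketch of the rule as reported in this comment (a chain that has already reached 125 layers accepts nothing more on top), not a port of the Go code; `can_stack_layer` and `MAX_LAYER_DEPTH` are illustrative names.

```python
# Limit reported in the comment above for this Docker source tree.
MAX_LAYER_DEPTH = 125


def can_stack_layer(parent_depth: int, max_depth: int = MAX_LAYER_DEPTH) -> bool:
    """Return True if another layer may be placed on a chain of parent_depth layers.

    Assumed rule: the request is refused once the parent chain has
    already reached the maximum depth.
    """
    return parent_depth < max_depth


print(can_stack_layer(125))  # False: the step after the 125th layer fails
print(can_stack_layer(124))  # True: with one layer fewer the build continues
```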

@luck3y
Contributor Author

luck3y commented Aug 21, 2019

@goldmann Thanks for the details, that makes sense, and I definitely like the docker cache reuse.

We'll have to have a look at the modules structure as it is now then and see where we can trim some things down -- I certainly don't just want to be constantly 1 under the limit :)

@spolti
Contributor

spolti commented Aug 21, 2019

I agree, cache reuse is really good.
For now, we will need to find a way to trim/optimize how the modules are being installed.

@goldmann
Contributor

I guess the other question is why we have such a limit in Docker in the first place. I'm pretty sure that most (all?) storage drivers can handle it easily.

@luck3y
Contributor Author

luck3y commented Sep 11, 2019

See: #613

After moving modules around somewhat and trying to create some headroom, I took another look at the Dockerfile. The inflation is mostly due to some modules we use that don't actually have content that needs copying (they mostly set env vars). These still end up having a COPY added for them, which creates a docker layer. With the change above in the template, we avoid 12 of these, speeding up the build and the squash too :)

Example of this is here: https://github.com/jboss-openshift/cct_module/blob/master/jboss/container/java/jvm/api/module.yaml

With the change in place, 12 COPYs of modules that contain only module.yaml and perhaps a README are skipped.

The one remaining concern I have with this is whether there is a legitimate use case for a module that contains an artifact, but no execution. Since the default action is to copy these into /tmp/artifacts/ I can't see there being much use in this, but I'd like others' thoughts.
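
The thread does not show the actual template condition from #613, so the following is only a guess at the kind of rule being discussed: skip the COPY when a module declares nothing to execute. It is a hypothetical sketch; the descriptors below are invented and use only the `envs`, `execute` and `artifacts` keys of a module.yaml.

```python
from typing import Any, Dict


def module_needs_copy(descriptor: Dict[str, Any]) -> bool:
    """Hypothetical rule: copy module content only if the module has scripts to run."""
    return bool(descriptor.get("execute"))


# Invented module descriptors, as they might look after parsing module.yaml.
env_only = {
    "name": "example.api",
    "envs": [{"name": "EXAMPLE_OPT", "value": "true"}],
}
with_script = {
    "name": "example.bash",
    "execute": [{"script": "configure.sh"}],
}
artifact_only = {
    "name": "example.artifact",
    "artifacts": [{"name": "example.jar"}],
}

print(module_needs_copy(env_only))       # False: nothing to run, COPY skipped
print(module_needs_copy(with_script))    # True: scripts must be in the image
print(module_needs_copy(artifact_only))  # False: the edge case raised above
```

Under this assumed rule the artifact-only module gets no module COPY, which is exactly the case the concern above asks about.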

@luck3y
Contributor Author

luck3y commented Sep 11, 2019

Diff of before / after here:

--- /home/kwills/df/Dockerfile.1	2019-09-11 12:02:30.153532746 -0500
+++ ./target/image/Dockerfile	2019-09-11 14:49:30.733512419 -0500
@@ -138,22 +138,16 @@
 
 ###### START module 'eap-cd-latest:1.0'
 ###### \
-        # Copy 'eap-cd-latest' module content
-        COPY modules/eap-cd-latest /tmp/scripts/eap-cd-latest
 ###### /
 ###### END module 'eap-cd-latest:1.0'
 
 ###### START module 'jboss.container.java.jvm.api:1.0'
 ###### \
-        # Copy 'jboss.container.java.jvm.api' module content
-        COPY modules/jboss.container.java.jvm.api /tmp/scripts/jboss.container.java.jvm.api
 ###### /
 ###### END module 'jboss.container.java.jvm.api:1.0'
 
 ###### START module 'jboss.container.proxy.api:2.0'
 ###### \
-        # Copy 'jboss.container.proxy.api' module content
-        COPY modules/jboss.container.proxy.api /tmp/scripts/jboss.container.proxy.api
 ###### /
 ###### END module 'jboss.container.proxy.api:2.0'
 
@@ -187,13 +181,6 @@
 ###### /
 ###### END module 'jboss.container.java.jvm.bash:1.0'
 
-###### START module 'os-java-run:1.0'
-###### \
-        # Copy 'os-java-run' module content
-        COPY modules/os-java-run /tmp/scripts/os-java-run
-###### /
-###### END module 'os-java-run:1.0'
-
 ###### START module 'dynamic-resources:1.0'
 ###### \
         # Copy 'dynamic-resources' module content
@@ -206,8 +193,6 @@
 
 ###### START module 'jboss.container.s2i.core.api:1.0'
 ###### \
-        # Copy 'jboss.container.s2i.core.api' module content
-        COPY modules/jboss.container.s2i.core.api /tmp/scripts/jboss.container.s2i.core.api
         # Set 'jboss.container.s2i.core.api' module defined environment variables
         ENV \
             S2I_SOURCE_DEPLOYMENTS_FILTER="*" 
@@ -221,8 +206,6 @@
 
 ###### START module 'jboss.container.maven.s2i.api:1.0'
 ###### \
-        # Copy 'jboss.container.maven.s2i.api' module content
-        COPY modules/jboss.container.maven.s2i.api /tmp/scripts/jboss.container.maven.s2i.api
 ###### /
 ###### END module 'jboss.container.maven.s2i.api:1.0'
 
@@ -241,8 +224,6 @@
 
 ###### START module 'jboss.container.maven.api:1.0'
 ###### \
-        # Copy 'jboss.container.maven.api' module content
-        COPY modules/jboss.container.maven.api /tmp/scripts/jboss.container.maven.api
 ###### /
 ###### END module 'jboss.container.maven.api:1.0'
 
@@ -307,8 +288,6 @@
 
 ###### START module 'jboss.container.jolokia.api:1.0'
 ###### \
-        # Copy 'jboss.container.jolokia.api' module content
-        COPY modules/jboss.container.jolokia.api /tmp/scripts/jboss.container.jolokia.api
         # Set 'jboss.container.jolokia.api' module defined environment variables
         ENV \
             AB_JOLOKIA_AUTH_OPENSHIFT="true" \
@@ -347,8 +326,6 @@
 
 ###### START module 'os-java-jolokia:1.0'
 ###### \
-        # Copy 'os-java-jolokia' module content
-        COPY modules/os-java-jolokia /tmp/scripts/os-java-jolokia
 ###### /
 ###### END module 'os-java-jolokia:1.0'
 
@@ -569,13 +546,6 @@
 ###### /
 ###### END module 'os-eap-probes:2.0'
 
-###### START module 'jboss-maven:1.0'
-###### \
-        # Copy 'jboss-maven' module content
-        COPY modules/jboss-maven /tmp/scripts/jboss-maven
-###### /
-###### END module 'jboss-maven:1.0'
-
 ###### START module 'os-eap-sso:1.0'
 ###### \
         # Copy 'os-eap-sso' module content
@@ -643,22 +613,16 @@
 
 ###### START module 'openshift-passwd:1.0'
 ###### \
-        # Copy 'openshift-passwd' module content
-        COPY modules/openshift-passwd /tmp/scripts/openshift-passwd
 ###### /
 ###### END module 'openshift-passwd:1.0'
 
 ###### START module 'os-logging:1.0'
 ###### \
-        # Copy 'os-logging' module content
-        COPY modules/os-logging /tmp/scripts/os-logging
 ###### /
 ###### END module 'os-logging:1.0'
 
 ###### START module 'jboss.container.prometheus.api:1.0'
 ###### \
-        # Copy 'jboss.container.prometheus.api' module content
-        COPY modules/jboss.container.prometheus.api /tmp/scripts/jboss.container.prometheus.api
 ###### /
 ###### END module 'jboss.container.prometheus.api:1.0'

@luck3y
Contributor Author

luck3y commented Sep 11, 2019

I've verified all of the affected modules in the jboss-eap-cd image and they do not contain additional content; they simply define env vars or install other dependent modules.

@spolti
Contributor

spolti commented Sep 11, 2019

Your diff is adding only commented lines, is this correct?

@luck3y
Contributor Author

luck3y commented Sep 11, 2019

Your diff is adding only commented lines, is this correct?

@spolti Yeah, ignore that + stuff at the end, that's from something else I was experimenting with in the image's image.yaml. The relevant differences are in the - COPY lines.

@luck3y
Contributor Author

luck3y commented Sep 11, 2019

I'll just take that part out, it's not relevant at all to this change.

@spolti
Contributor

spolti commented Sep 11, 2019

yeah, as you stated, those copy instructions only copy module.yaml files.

I didn't understand what you said here:

The one remaining concern I have with this is whether there is a legitimate use case for a module that contains an artifact, but no execution. Since the default action is to copy these into /tmp/artifacts/ I can't see there being much use in this, but I'd like others' thoughts.

@luck3y
Contributor Author

luck3y commented Sep 11, 2019

@spolti I was asking if there'd ever be a case where you'd have a module copy a file into /tmp/artifacts but not do anything with it, maybe expecting a later module to find it there and actually do something. I don't think this is a legitimate use case, and I don't think we use it anywhere, but I thought I'd point it out anyway.

@luck3y
Contributor Author

luck3y commented Sep 11, 2019

Note, I'll regen the diff against an image with an unmodified image.yaml I think, there are some other unrelated changes in that diff too...

@luck3y
Contributor Author

luck3y commented Sep 11, 2019

Diff from using an unmodified version of https://github.com/jboss-container-images/jboss-eap-7-openshift-image/blob/eap72-dev/image.yaml

$ diff -ur ~/df/Dockerfile.1 ./target/image/Dockerfile 
--- /home/kwills/df/Dockerfile.1	2019-09-11 12:02:30.153532746 -0500
+++ ./target/image/Dockerfile	2019-09-11 15:27:01.680349097 -0500
@@ -138,22 +138,16 @@
 
 ###### START module 'eap-cd-latest:1.0'
 ###### \
-        # Copy 'eap-cd-latest' module content
-        COPY modules/eap-cd-latest /tmp/scripts/eap-cd-latest
 ###### /
 ###### END module 'eap-cd-latest:1.0'
 
 ###### START module 'jboss.container.java.jvm.api:1.0'
 ###### \
-        # Copy 'jboss.container.java.jvm.api' module content
-        COPY modules/jboss.container.java.jvm.api /tmp/scripts/jboss.container.java.jvm.api
 ###### /
 ###### END module 'jboss.container.java.jvm.api:1.0'
 
 ###### START module 'jboss.container.proxy.api:2.0'
 ###### \
-        # Copy 'jboss.container.proxy.api' module content
-        COPY modules/jboss.container.proxy.api /tmp/scripts/jboss.container.proxy.api
 ###### /
 ###### END module 'jboss.container.proxy.api:2.0'
 
@@ -189,8 +183,6 @@
 
 ###### START module 'os-java-run:1.0'
 ###### \
-        # Copy 'os-java-run' module content
-        COPY modules/os-java-run /tmp/scripts/os-java-run
 ###### /
 ###### END module 'os-java-run:1.0'
 
@@ -206,8 +198,6 @@
 
 ###### START module 'jboss.container.s2i.core.api:1.0'
 ###### \
-        # Copy 'jboss.container.s2i.core.api' module content
-        COPY modules/jboss.container.s2i.core.api /tmp/scripts/jboss.container.s2i.core.api
         # Set 'jboss.container.s2i.core.api' module defined environment variables
         ENV \
             S2I_SOURCE_DEPLOYMENTS_FILTER="*" 
@@ -221,8 +211,6 @@
 
 ###### START module 'jboss.container.maven.s2i.api:1.0'
 ###### \
-        # Copy 'jboss.container.maven.s2i.api' module content
-        COPY modules/jboss.container.maven.s2i.api /tmp/scripts/jboss.container.maven.s2i.api
 ###### /
 ###### END module 'jboss.container.maven.s2i.api:1.0'
 
@@ -241,8 +229,6 @@
 
 ###### START module 'jboss.container.maven.api:1.0'
 ###### \
-        # Copy 'jboss.container.maven.api' module content
-        COPY modules/jboss.container.maven.api /tmp/scripts/jboss.container.maven.api
 ###### /
 ###### END module 'jboss.container.maven.api:1.0'
 
@@ -307,8 +293,6 @@
 
 ###### START module 'jboss.container.jolokia.api:1.0'
 ###### \
-        # Copy 'jboss.container.jolokia.api' module content
-        COPY modules/jboss.container.jolokia.api /tmp/scripts/jboss.container.jolokia.api
         # Set 'jboss.container.jolokia.api' module defined environment variables
         ENV \
             AB_JOLOKIA_AUTH_OPENSHIFT="true" \
@@ -347,8 +331,6 @@
 
 ###### START module 'os-java-jolokia:1.0'
 ###### \
-        # Copy 'os-java-jolokia' module content
-        COPY modules/os-java-jolokia /tmp/scripts/os-java-jolokia
 ###### /
 ###### END module 'os-java-jolokia:1.0'
 
@@ -571,8 +553,6 @@
 
 ###### START module 'jboss-maven:1.0'
 ###### \
-        # Copy 'jboss-maven' module content
-        COPY modules/jboss-maven /tmp/scripts/jboss-maven
 ###### /
 ###### END module 'jboss-maven:1.0'
 
@@ -643,22 +623,16 @@
 
 ###### START module 'openshift-passwd:1.0'
 ###### \
-        # Copy 'openshift-passwd' module content
-        COPY modules/openshift-passwd /tmp/scripts/openshift-passwd
 ###### /
 ###### END module 'openshift-passwd:1.0'
 
 ###### START module 'os-logging:1.0'
 ###### \
-        # Copy 'os-logging' module content
-        COPY modules/os-logging /tmp/scripts/os-logging
 ###### /
 ###### END module 'os-logging:1.0'
 
 ###### START module 'jboss.container.prometheus.api:1.0'
 ###### \
-        # Copy 'jboss.container.prometheus.api' module content
-        COPY modules/jboss.container.prometheus.api /tmp/scripts/jboss.container.prometheus.api
 ###### /
 ###### END module 'jboss.container.prometheus.api:1.0'

@spolti
Contributor

spolti commented Sep 12, 2019

This is a good thing and I agree with it. I can remember a use case as well, and it seems it saves a few layers :)
thanks for pointing this out.

@ochaloup

Just mentioning that I hit the same issue with CEKit 3.6.0 when building the JBoss EAP 7 image locally.
@luck3y helped me work around the problem by guiding me to use virtualenv and CEKit version 3.2.1.

@luck3y
Contributor Author

luck3y commented Nov 26, 2019

@ochaloup FYI this is really an issue with docker and not cekit, the docker layer limit is a bit low IMO.

@spolti
Contributor

spolti commented Jun 17, 2021

this can help to overcome it
https://stackoverflow.com/questions/33051108/how-to-get-around-the-linux-too-many-arguments-limit

but, as mentioned on one of the comments there, this change might work on one machine but not on others.

Or, maybe not, as the maxLayerDepth is hardcoded.
