
Fix GitubActions Docker issue #318

Draft · wiktor2200 wants to merge 9 commits into main from Fix-GitubActions-docker-issue

Conversation

wiktor2200 (Collaborator)

No description provided.

@wiktor2200 force-pushed the Fix-GitubActions-docker-issue branch 3 times, most recently from c77315c to ea19c67 on November 11, 2023 16:37
@wiktor2200 added the bug, ci, and github_actions labels on Nov 11, 2023
@wiktor2200 force-pushed the Fix-GitubActions-docker-issue branch 6 times, most recently from 3e46c4e to cf92c7b on November 11, 2023 19:27
@wiktor2200 (Collaborator Author) commented Nov 11, 2023

Hi! @staticdev @aalaesar I've tried to fix a problem with the Docker Molecule test but I've run out of ideas. Would you be able to take a look? Maybe you will have some other solutions for this problem?

@geerlingguy Sorry for bothering you, but maybe you have an idea what could have gone wrong here? Have you ever seen such an error in your Ansible images?
https://github.com/nextcloud/ansible-collection-nextcloud-admin/actions/runs/6836211015/job/18590869076?pr=318#step:7:110

  failed: [localhost] (item={'failed': 0, 'started': 1, 'finished': 0, 'ansible_job_id': 'j801881025068.2108', 'results_file': '/home/runner/.ansible_async/j801881025068.2108', 'changed': True, 'item': {'cgroupns_mode': 'host', 'command': '', 'image': 'docker.io/geerlingguy/docker-debian12-ansible:latest', 'name': 'instance', 'pre_build_image': True, 'privileged': True, 'volumes': ['/sys/fs/cgroup:/sys/fs/cgroup:rw']}, 'ansible_loop_var': 'item'}) => {"ansible_job_id": "j801881025068.2108", "ansible_loop_var": "item", "attempts": 8, "changed": false, "finished": 1, "item": {"ansible_job_id": "j801881025068.2108", "ansible_loop_var": "item", "changed": true, "failed": 0, "finished": 0, "item": {"cgroupns_mode": "host", "command": "", "image": "docker.io/geerlingguy/docker-debian12-ansible:latest", "name": "instance", "pre_build_image": true, "privileged": true, "volumes": ["/sys/fs/cgroup:/sys/fs/cgroup:rw"]}, "results_file": "/home/runner/.ansible_async/j801881025068.2108", "started": 1}, "msg": "Error creating container: 500 Server Error for http+docker://localhost/v1.43/containers/create?name=instance: Internal Server Error (\"symlink /proc/mounts /var/lib/docker/fuse-overlayfs/4441cd54c476cdd29d6f1ded1e93781e3c3929ca7407bbc645bd90b92c4c22e2-init/merged/etc/mtab: file exists\")", "results_file": "/home/runner/.ansible_async/j801881025068.2108", "started": 1, "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

@aalaesar (Member) commented Nov 11, 2023

Hello @wiktor2200 thank you for taking some time to fix the CI.
I've also been trying to fix it on some other branch but with no success. 😞
Most of the time I suppose the issue is in our code, as the Ansible image we use is popular and I couldn't find anyone reporting a similar issue.
Let's see if Jeff Geerling can help us 😉

Edit: just a thought. Maybe we are upgrading Ansible too fast with Dependabot for us to keep up with ansible/molecule changes.

Regards

@geerlingguy

It looks like the error is:

Error creating container: 500 Server Error for http+docker://localhost/v1.43/containers/create?name=instance: Internal Server Error (\"symlink /proc/mounts /var/lib/docker/fuse-overlayfs/4441cd54c476cdd29d6f1ded1e93781e3c3929ca7407bbc645bd90b92c4c22e2-init/merged/etc/mtab: file exists\")

I've seen similar file mount issues in GitHub Actions sometimes, but haven't in the past few months. Is this only with debian12?

@wiktor2200 (Collaborator Author)

Hello Jeff! Thanks a lot for your involvement, I really appreciate it :)

When we searched for this issue, we didn't find many similar reports, that's why I asked.
It occurs randomly in all of our Molecule test scenarios (Debian 11/12 and Ubuntu 20.04/22.04); we define the scenarios this way:

    strategy:
      matrix:
        distro: [debian12, debian11, ubuntu2204, ubuntu2004]
        nc_version: [latest, nc26, nc25]

Then running it with:

MOLECULE_NC: ${{ matrix.nc_version }}
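For context, these matrix values reach Molecule as environment variables on the test step, roughly like this (a sketch; the step name and the MOLECULE_DISTRO wiring are assumptions, only the MOLECULE_NC line above is quoted from the workflow):

      - name: Run Molecule tests
        run: molecule test
        env:
          # assumed wiring: matrix values exported so molecule.yml can interpolate them
          MOLECULE_DISTRO: ${{ matrix.distro }}
          MOLECULE_NC: ${{ matrix.nc_version }}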

Molecule itself is defined here:

platforms:
  - name: instance
    image: "docker.io/geerlingguy/docker-${MOLECULE_DISTRO:-debian12}-ansible:latest"
    cgroupns_mode: host
    command: ${MOLECULE_COMMAND:-""}
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
    privileged: true
    pre_build_image: true

And as it's a matrix, once one scenario fails the rest are cancelled. In this PR I've tried cleaning the Docker cache (inspired by your old blog post: https://www.jeffgeerling.com/blog/2018/testing-your-ansible-roles-molecule) and then molecule reset when docker system prune didn't help.
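For reference, that cleanup attempt boils down to a workflow step roughly like this (a sketch, not the exact step from this PR; docker system prune and molecule reset are the standard commands, the flags shown are an assumption):

      - name: Clean docker cache
        run: |
          # free the runner of unused containers, images, networks and build cache
          docker system prune --all --force
          # discard Molecule's cached ephemeral state from earlier runs
          molecule reset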

@aalaesar (Member) commented Nov 16, 2023

Hello there
@wiktor2200 found this subject on the Linux Containers forum that looks much like our issue.
Is there a way to check if our GitHub Actions are running on top of LXD?
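One throwaway diagnostic step could answer that (a sketch; systemd-detect-virt ships with the Ubuntu runner images, interpreting its output is still up to us):

      - name: Check what the runner is virtualized on
        run: |
          # reports the container technology PID 1 runs under (lxc, docker, none, ...)
          systemd-detect-virt --container || true
          # reports the underlying hypervisor (kvm, microsoft, none, ...)
          systemd-detect-virt || true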

@staticdev (Collaborator)

Hello all, I have been super busy with some other Ansible issues and construction (like @geerlingguy =p), and I don't really understand why this issue is happening. Most of my roles are tested against the same images and I don't get such an error. I would suggest trying Podman instead of Docker, since I have mostly replaced Docker with Podman now. It is an alternative solution.
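If we try that, the Molecule side of the switch would look roughly like this in molecule.yml (a sketch assuming we keep the same geerlingguy image; whether cgroupns_mode and the cgroup volume are still needed under Podman would have to be verified):

driver:
  name: podman
platforms:
  - name: instance
    image: "docker.io/geerlingguy/docker-${MOLECULE_DISTRO:-debian12}-ansible:latest"
    command: ${MOLECULE_COMMAND:-""}
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
    privileged: true
    pre_build_image: true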

@@ -44,9 +44,9 @@ jobs:
       - name: Install test dependencies
         run: |
-          python3 -m pip install --constraint=.github/workflows/constraints.txt ansible 'molecule-plugins[docker]' docker netaddr
+          python3 -m pip install --constraint=.github/workflows/constraints.txt ansible 'molecule-plugins[podman]' podman netaddr
Collaborator review comment:

adding podman is not necessary

@@ -16,8 +16,7 @@ jobs:
           fetch-depth: 0

       - name: Run ansible-lint
-        # replace `main` with any valid ref, or tags like `v6`
-        uses: ansible/ansible-lint-action@v6
+        uses: ansible/[email protected]
Collaborator review comment:

This is making some new complaints about linting; at least I see the logs from the latest execution showing 2 errors. I would keep it separate from this PR.

@@ -28,6 +28,10 @@ jobs:
         with:
          path: "nextcloud.ansible-collection-nextcloud-admin"

+      - name: Clean docker cache
Collaborator review comment:

Those lines also do not make sense when you are testing podman.

@staticdev (Collaborator)

@wiktor2200 Thanks for trying it out. I noted some potential issues with the current state of the PR in the review comments.

@aalaesar (Member) commented Feb 7, 2024

Hello there.
I noticed that we are not running into this issue anymore... somehow the issue disappeared.
I'll keep the PR in draft for now until we are confident the issue is gone for good.
Regards

@aalaesar marked this pull request as draft on February 7, 2024 08:30