Concourse resource versions disappear when resource type's image got updated #8848

Open
xtremerui opened this issue Oct 24, 2023 · 0 comments

Summary

There is a bug in Concourse's resource version mechanism: all versions of a resource can disappear (at least, they are no longer returned by the API; see details below) when its resource type is updated, i.e. when a new resource type image is fetched.

Steps to reproduce

  1. Set a pipeline with the following YAML by running fly -t dev sp -p test -c test.yml:
resource_types:
- name: mock-test
  type: registry-image
  source:
    repository: concourse/mock  # could be any fork of Concourse mock-resource, as long as you can push the image to Docker Hub
    tag: test
resources:
- name: some-resource
  type: mock-test
jobs:
- name: test
  plan:
  - get: some-resource
  - task: test
    config:
      platform: linux

      image_resource:
        type: registry-image
        source:
          repository: alpine
      inputs:
      - name: some-resource
      run:
        path: sh
        args:
        - -exc
        - |
          cat some-resource/version
  2. Run the job twice. The only version returned by mock-resource is mock, so the resource page of some-resource should show a single version, mock. When that version is expanded, it should show that mock was the input of builds 1 and 2.
  3. Run the following commands to make the resource return a couple more versions:
    fly -t dev check-resource --resource test/some-resource --from version:foo
    fly -t dev check-resource --resource test/some-resource --from version:bar
    Observe that versions foo and bar now show up.
  4. Run the job a third time. It should use the latest version, bar, as input. Verify that the details of version bar show build 3, while version foo shows nothing.
  5. Go to your local repo of mock-resource and make a dummy change to the Dockerfile, then build and push the image:
    docker build -t concourse/mock:test --build-arg base_image=concourse/resource-types-base-image-static:latest .
    docker push concourse/mock:test
  6. Go back to the resource page of some-resource. If the automatic check hasn't run yet, trigger it manually. Once it has run, observe that only version mock is showing, and its history still shows it was the input of builds 1 and 2.
  7. Run the fly check-resource commands again for versions foo and bar. Both versions should show up again, and the input history of bar again shows build 3.
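
The comparison in the steps above (which versions are visible before and after the image push) could be scripted instead of checked by eye in the UI. The following is a minimal sketch, not part of the original report. It assumes that fly resource-versions prints each version as a version:<value> token, which is how the mock resource's single version key is expected to render. The FLY, TARGET, and RESOURCE variables and both function names are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Sketch: detect resource versions that vanish after a resource type image update.
# Assumptions (not from the issue): the output of `fly resource-versions`
# contains one "version:<value>" token per row for this resource.

FLY="${FLY:-fly}"                            # override to stub fly out in tests
TARGET="${TARGET:-dev}"                      # fly target used in the issue
RESOURCE="${RESOURCE:-test/some-resource}"   # pipeline/resource from the issue

# Print the sorted, de-duplicated version tokens currently returned by the API.
snapshot_versions() {
  "$FLY" -t "$TARGET" resource-versions -r "$RESOURCE" \
    | grep -o 'version:[^ ,]*' \
    | sort -u
}

# Print versions present in the "before" snapshot but missing from "after".
# Both arguments are files written by snapshot_versions (already sorted).
lost_versions() {
  comm -23 "$1" "$2"
}
```

A possible use: run snapshot_versions > before.txt after the third build, push the new image and wait for the check, run snapshot_versions > after.txt, then run lost_versions before.txt after.txt. If the bug reproduces as described, the output would be expected to list foo and bar.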

Expected results

All three versions of some-resource (mock, foo, and bar) should still show after the resource type image update.

Actual results

Only one version, mock, shows.

Additional context

  • priority. Although this bug sounds like a critical problem, it won't have much impact on most daily tasks, since
    a) after the versions are lost, the next automatic check will at least return the latest version (mock in the case above), and the returned version still keeps its build input history (input of builds 1 and 2).
    b) if a user wants to rerun an old build with an old version of the resource, there is a workaround: manually run fly check-resource to bring that specific version back (the exact version can be found on the build page). All input history of the version remains available.
  • scope. A general hypothesis about this bug is that it might be related to how Concourse garbage-collects (GC) resource caches, which are used for both resources and resource types. When a new image SHA is fetched for a resource type, the resource type gets a new version, so the resource type cache for its old version is busted and GC kicks in. There might be some connection between the resource caches of a resource's versions and those of the resource type the resource is based on, so that when GC runs for the busted resource type cache, the resource caches of the resource versions are also busted and removed, while the actual resource versions still exist in the DB. This would explain why, even though the API can't find the versions at first, an old version still holds info such as its build input history once it is checked back in. One entry point for investigating this problem could be to look for a lost version in the DB while the UI doesn't show it.
  • observation. In the Concourse CI, there is a ubuntu-image resource that is affected by this bug, while there is also golang-linux, which has kept all of its versions since last year (when the CI was wiped and redeployed). It's still unclear why these two resources behave differently.
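
The workaround described under priority (b) could be wrapped in a small helper that re-checks each lost version in turn. This is only a sketch under assumptions: it presumes the resource's version key is version, as with the mock resource in this report, and the FLY, TARGET, and RESOURCE variables and the restore_versions name are hypothetical.

```shell
#!/bin/sh
# Sketch of the workaround: re-check specific versions so they reappear.
# Assumption: the version key is "version", as for the mock resource above.

FLY="${FLY:-fly}"                            # override to stub fly out in tests
TARGET="${TARGET:-dev}"
RESOURCE="${RESOURCE:-test/some-resource}"

# Run check-resource once per version value given as an argument,
# stopping at the first failure.
restore_versions() {
  for v in "$@"; do
    "$FLY" -t "$TARGET" check-resource --resource "$RESOURCE" \
      --from "version:$v" || return 1
  done
}
```

For the pipeline in this report, restore_versions foo bar would issue the same two check-resource commands used in the reproduction steps.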

Triaging info

  • Concourse version: master
  • Browser (if applicable):
  • Did this used to work? not sure
@xtremerui xtremerui added the bug label Oct 24, 2023