There is a bug in Concourse's resource version mechanism: all versions of a resource can be deleted (or at least are no longer returned from the API; see details below) when its resource type is updated, i.e. when a new resource type image is fetched.
Steps to reproduce
Set a pipeline by running fly -t dev sp -p test -c test.yml with the following test.yml:
resource_types:
- name: mock-test
  type: registry-image
  source:
    repository: concourse/mock  # could be any fork of the Concourse mock-resource, as long as you can push the image to Docker Hub
    tag: test
resources:
- name: some-resource
  type: mock-test
jobs:
- name: test
  plan:
  - get: some-resource
  - task: test
    config:
      platform: linux
      image_resource:
        type: registry-image
        source:
          repository: alpine
      inputs:
      - name: some-resource
      run:
        path: sh
        args:
        - -exc
        - |
          cat some-resource/version
Run the job twice. The only version returned by mock-resource is mock, so the resource page of some-resource should show one version, mock. If the version is expanded, it should show that version mock is the input of builds 1 and 2.
Make the resource return a couple more versions by running:
fly -t dev check-resource --resource test/some-resource --from version:foo
fly -t dev check-resource --resource test/some-resource --from version:bar
Versions foo and bar should now be showing.
Run the job a third time; it should use the latest version, bar, as input. Verify that the details of version bar show build 3 while version foo shows nothing.
In your local clone of mock-resource, make a dummy change to the Dockerfile, then build and push the image:
docker build -t concourse/mock:test --build-arg base_image=concourse/resource-types-base-image-static:latest .
docker push concourse/mock:test
Go back to the resource page of some-resource. If the automatic check hasn't run yet, trigger it manually. Once it has run, observe that only version mock is showing, and its history still shows it as the input of builds 1 and 2.
Run the fly check-resource commands again for versions foo and bar. Both versions should show up again, and the input history of bar still shows build 3.
Expected results
All 3 versions of some-resource should still show after the resource type image is updated.
Actual results
Only one version, mock, shows.
Additional context
Priority. While this bug sounds like a critical problem, it should not have much impact on most daily tasks, since:
a) after the versions are lost, once the automatic check runs, the resource will at least return the latest version (version mock in the case above), and the returned version still keeps its build input history (input of builds 1 and 2).
b) if a user wants to rerun an old build with an old version of the resource, they at least have a workaround: manually run fly check-resource to bring that specific version back (the exact version can be found on the build page). All input history of that version remains available.
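When several versions need to be brought back, the workaround in b) can be scripted. The following is only a minimal sketch: restore_cmds is a hypothetical helper name, and the target (dev), the pipeline/resource path (test/some-resource) and the version:foo / version:bar pairs are taken from the reproduction steps above. The version key (version here) is specific to the mock resource; other resource types use different keys. The helper only prints the commands, so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Hypothetical helper: print one fly check-resource command per version
# that should be brought back. It is a dry run and never calls fly itself.
#   $1     fly target (e.g. dev)
#   $2     pipeline/resource (e.g. test/some-resource)
#   $3...  version fields as key:value pairs (e.g. version:foo)
restore_cmds() {
  target=$1
  resource=$2
  shift 2
  for version in "$@"; do
    printf 'fly -t %s check-resource --resource %s --from %s\n' \
      "$target" "$resource" "$version"
  done
}

# Example using the names from the reproduction steps.
restore_cmds dev test/some-resource version:foo version:bar
```

After reviewing the printed commands, they can be executed by piping the output to sh.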
Scope. A general hypothesis about this bug is that it might be related to how Concourse garbage-collects (GC) resource caches, which are used for both resources and resource types. When a new image SHA is fetched for a resource type, the resource type gets a new version, so the resource type cache for the old version is busted, at which point GC kicks in. There might be some connection between the resource caches of a resource's versions and the resource type that the resource is based on, so that when GC runs for the busted resource type cache, the resource caches of the resource versions are also busted and removed, while the actual resource versions still exist in the DB. This would explain why, even though the API cannot find the versions at first, an old version still holds information such as its build input history once it is checked back in. One starting point for investigating could be to look for the lost versions in the DB while the UI shows none.
Observation. In the Concourse CI, the ubuntu-image resource is affected by this bug, while golang-linux still holds all versions since last year (when CI was wiped and redeployed). It's still unclear why these two resources behave differently.
Triaging info
Concourse version: master
Browser (if applicable):
Did this used to work? not sure