Functions - BlobTrigger container scan should not skip past changes made during the scan #3014

Open: wants to merge 5 commits into base: dev

AartBluestoke

When a container scan is continued across multiple pages, the later pages should not advance the timestamp past the time at which the first page of the scan was fetched.

Without this change, an update to a file on the first page that occurs while the later API calls are in progress might be missed.

Old process (tN is the time of step N):
0. old threshold

  1. write file v1 (t1).
  2. begin scan (t2).
  3. read file v1.
  4. write file v2 (t4).
  5. a write occurs to another file (t5).
  6. fetch the next batch of results from the container scan.
  7. find 'another, t5'.
  8. the highlighted line of code sets the 'high water mark' to t5, skipping over file v2, which was written at t4.
  9. container scan ends.
  10. next scan begins at t5, missing the update at t4.

The correct 'high water mark' is the minimum of t5 and t2, the time when the scan began.
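
The capping rule stated above can be sketched as follows. This is a minimal illustration, not the SDK's code: `ScanState`, `BeginScan`, and `ObservePage` are hypothetical names, and the sketch assumes the high water mark is simply the latest `LastModified` seen during the scan, capped at the time the scan began.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical state holder; the names are illustrative and are not the SDK's.
public sealed class ScanState
{
    public DateTime ScanBeginTime { get; private set; }
    public DateTime HighWaterMark { get; private set; } = DateTime.MinValue;

    // Called when a scan starts, i.e. when there is no continuation token yet.
    public void BeginScan(DateTime nowUtc)
    {
        ScanBeginTime = nowUtc;
        HighWaterMark = DateTime.MinValue;
    }

    // Called once per page with the LastModified values found on that page.
    public void ObservePage(IEnumerable<DateTime> lastModifiedTimes)
    {
        foreach (var lastModified in lastModifiedTimes)
        {
            // Never let a later page push the mark past the scan's start time.
            var capped = lastModified < ScanBeginTime ? lastModified : ScanBeginTime;
            if (capped > HighWaterMark)
            {
                HighWaterMark = capped;
            }
        }
    }
}
```

With the timeline above, a scan that begins at t2 and later sees 'another, t5' would end with a mark no later than t2, so the write at t4 still falls after the mark and would be picked up by the next scan.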

New process:
0. old threshold

  1. write file v1 (t1).
  2. begin scan (t2).
  3. find 'file v1, t1'; set the high water mark to t1.
  4. write file v2 (t4).
  5. a write occurs to another file (t5).
  6. fetch the next batch of results from the container scan.
  7. find 'another, t5'.
  8. the high water mark IS NOT updated.
  9. container scan ends.
  10. next scan begins at t1, finding the file written at t4.
  11. the scan finds the file written at t5 again, but rejects it as a duplicate via its ETag.
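
Steps 10 and 11 can be sketched as a rescan that filters by the high water mark and then drops blobs whose ETag has already been handled. This is a rough sketch under stated assumptions: `BlobEntry` and `RescanSketch` are hypothetical, timestamps are reduced to the integer step numbers t1..t5, the strictly-greater comparison is an assumption, and the in-memory `HashSet` merely stands in for whatever ETag duplicate check the runtime actually performs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical blob listing entry; LastModified uses the step numbers t1..t5.
public record BlobEntry(string Name, string ETag, int LastModified);

public static class RescanSketch
{
    // Returns the names of blobs that should trigger on this scan.
    public static List<string> NewBlobs(
        IEnumerable<BlobEntry> container,
        int highWaterMark,
        ISet<string> processedETags)
    {
        var result = new List<string>();
        foreach (var blob in container)
        {
            // Assumption: only blobs modified after the mark are considered.
            if (blob.LastModified <= highWaterMark) continue;

            // Add returns false when this ETag was already processed.
            if (!processedETags.Add(blob.ETag)) continue;

            result.Add(blob.Name);
        }
        return result;
    }
}
```

For example, with a threshold of t1 and the first scan having already processed the ETags of 'file v1' and 'another', a rescan over 'file v2' (t4) and 'another' (t5) would return only the file, because the unchanged ETag of 'another' is rejected.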

AartBluestoke and others added 2 commits August 25, 2023 09:27
…hould not advance the timestamp past when the first page of the scan occurs.

Without this change an update to a file on the first page that occurs while the later API calls are occurring might be missed.
@AartBluestoke
Author

@microsoft-github-policy-service agree

@AartBluestoke
Author

PR raised in response to a bug discussed in Azure support ticket TrackingID#2308210030000318.

@AartBluestoke changed the title from "Bugs/no scan from future" to "Functions - BlobTrigger container scan should not skip past changes made during the scan" on Aug 25, 2023
@@ -214,6 +214,7 @@ private void ThrowIfDisposed()
// if starting the cycle, reset the sweep time
if (continuationToken == null)
{
containerScanInfo.CurrentScanBeginTime = DateTime.UtcNow;
Author

Is "Scan" or "Sweep" the better name here? (Both words are used interchangeably in this code.)
