
Unpredictable paging of event logs api results in duplicated or missed events #3690

Open
WaciX opened this issue Jan 21, 2024 · 2 comments
WaciX commented Jan 21, 2024

Steps To Reproduce

  1. Log in to the Organization Vault.
  2. Generate a few events - add a few items to the default collection, or add some groups.
  3. Fetch events via the API with /public/events, specifying the start date as now minus 1 hour and the end date as the current time.
  4. Generate a few more events.
  5. Option A: Fetch more events via /public/events, specifying the start date as the timestamp of the newest (first) event fetched in step 3 and the end date as the current time.
    (Fetch every single event by paging with each result's continuation token.)
  6. Option B: Fetch more events via /public/events, specifying the start date as the newest event timestamp from step 3 plus some fraction of a second, and the end date as the current time.
    (Fetch every single event by paging with each result's continuation token.)
    Repeat this step for these variations:
    • adding 0.1 second,
    • adding 0.001 second (1 millisecond),
    • adding 0.000001 second (1 microsecond),
    • adding 0.0000001 second (0.1 microsecond, the maximum precision of the date-time type).
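For reference, the Option A and Option B start-date computations can be sketched in Python (illustrative only; the timestamp is an example value, and note that Python's datetime tops out at microsecond precision, so the 0.1 microsecond variant cannot even be expressed):

```python
from datetime import datetime, timedelta, timezone

# Timestamp of the newest event fetched in step 3 (example value).
last_event = datetime(2024, 1, 11, 1, 41, 3, 735438, tzinfo=timezone.utc)

# Option A: reuse the newest event's timestamp as the next start date.
start_a = last_event

# Option B: nudge the start date forward by various fractions of a second.
start_b_100ms = last_event + timedelta(milliseconds=100)  # 0.1 s
start_b_1ms = last_event + timedelta(milliseconds=1)      # 0.001 s
start_b_1us = last_event + timedelta(microseconds=1)      # 0.000001 s
# 0.0000001 s (one .NET tick) is below Python's microsecond resolution,
# so it cannot be represented with datetime/timedelta at all.

print(start_b_1us.isoformat())  # 2024-01-11T01:41:03.735439+00:00
```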

Expected Result

No duplicates or missed events fetched from the API, regardless of whether Option A or B is used.

Actual Result

Option A gives you at least one duplicate among the events fetched from the API (and if you expect exactly one duplicate, you are wrong - see Additional Context below).
Option B misses some events, depending on the precision used - and the correct precision is pure guesswork (see Additional Context below).

Screenshots or Videos

No response

Additional Context

API

Both the start and end parameters of the /public/events API are inclusive.

Problem

Because of this, there is no way to paginate through the events from the API and end up with exactly the set of events that was generated in the first place.
Option A obviously gives you duplicate(s).
But Option B is the unpredictable one, because the effective precision of the event dates is completely unpredictable.

Option B exploration

Let's say we fetched an event with date 2024-01-11T01:41:03.735438Z. There are always 7 digits in the second fraction (0.1 microsecond precision) - this is the maximum precision of the DateTime type in C#/.NET (the Bitwarden server API implementation).
You might think that if you increase the last (7th) digit by 1, giving 2024-01-11T01:41:03.735439Z, and specify that as start in the next /public/events call, you will not get the duplicate.
You would be wrong: you still get the same event.
Changing the 6th or 5th digit results in the same behavior - you still retrieve the event.
The behavior only changes when the 4th digit is increased by 1.
You might say: "OK, so the actual precision is 0.0001 seconds - problem solved, no more duplicates." Wrong again, because you have now potentially missed a different event with a nearby timestamp, such as 2024-01-11T01:41:03.735448Z.
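The collision can be reproduced outside C#: Python floats are the same IEEE 754 doubles, so a quick approximation of the millisecond-key computation (at microsecond rather than tick resolution, the names below are illustrative) shows the same behavior:

```python
from datetime import datetime, timezone

_MAX = datetime(9999, 1, 1, tzinfo=timezone.utc)

def storage_key(d: datetime) -> float:
    # Approximation of DateTimeToTableStorageKey: total milliseconds
    # until year 9999, as a double (the source of the precision loss).
    return (_MAX - d).total_seconds() * 1000.0

base = datetime(2024, 1, 11, 1, 41, 3, 735438, tzinfo=timezone.utc)
plus_1us = base.replace(microsecond=735439)    # 6th fractional digit changed
plus_100us = base.replace(microsecond=735538)  # 4th fractional digit changed

print(storage_key(base) == storage_key(plus_1us))    # True: same key, duplicate
print(storage_key(base) == storage_key(plus_100us))  # False: distinct keys
```

The spacing between adjacent doubles at this magnitude is roughly 30 microseconds, which is why only a change in the 4th fractional digit (100 microseconds) is guaranteed to land on a different key.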

Technical details - explanation of the 4-digit precision

The actual problem comes from the way events are queried from the DB and sorted into descending order by event date - which is also why the events returned from the API are in this order (regardless of the number of events and the continuation token).
See the DateTimeToTableStorageKey method in src/Core/Utilities/CoreHelpers.cs - used in src/Core/Repositories/TableStorage/EventRepository.cs to fetch events from Bitwarden's cloud DB.
The descending order is achieved by inverting how total milliseconds in Unix-style time behave when comparing two dates. Normally, the newer date has more milliseconds elapsed than the older date; with this method it is the reverse. (This solution was probably picked instead of an ORDER BY on the DB query because ORDER BY was not supported at the time in the Azure Cosmos DB .NET SDK used for the Bitwarden cloud DB.)

    private static readonly DateTime _max = new DateTime(9999, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    public static string DateTimeToTableStorageKey(DateTime? date = null)
    {
        if (date.HasValue)
        {
            date = date.Value.ToUniversalTime();
        }
        else
        {
            date = DateTime.UtcNow;
        }

        return _max.Subtract(date.Value).TotalMilliseconds.ToString(CultureInfo.InvariantCulture);
    }

Examples:

  1. DateTimeToTableStorageKey(DateTime.Parse("2024-01-11T01:41:03.7354Z")) -> 251665827536264.6
  2. DateTimeToTableStorageKey(DateTime.Parse("2024-01-11T01:41:03.735438Z")) -> 251665827536264.56
  3. DateTimeToTableStorageKey(DateTime.Parse("2024-01-11T01:41:03.735439Z")) -> 251665827536264.56 (same as above)
  4. DateTimeToTableStorageKey(DateTime.Parse("2024-01-11T01:41:03.73544Z")) -> 251665827536264.56 (same as above)
  5. DateTimeToTableStorageKey(DateTime.Parse("2024-01-11T01:41:03.7355Z")) -> 251665827536264.5

Between 2024-01-11T01:41:03.7354Z and 2024-01-11T01:41:03.7355Z there are 3 different unique numbers, instead of the 2 you would expect from a 0.0001 (4th-digit) increase.

This means that:

  • With Option A, you get duplicates - namely the 2nd, 3rd and 4th examples. Even though those date-times are unique, they produce the same DB query key.
  • With Option B, you miss events - again the 2nd, 3rd and 4th. Increasing the 4th digit of the 1st example by 1 yields the 5th date.
    The answer for Option B is not to increase by 0.00005 either; here is why.

There is a precision loss exactly at the point of the .TotalMilliseconds call.
This is because TotalMilliseconds is a double-precision floating-point value, which has a maximum precision of roughly 15-17 significant digits. The value 251665827536264.56 already uses the maximum precision of a double, but the actual number for 2024-01-11T01:41:03.735438Z should be 251665827536264.5620 - see _max.Subtract(date.Value).Ticks, which returns a long with no precision loss.
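The loss can be made visible by computing the same quantity both exactly and through a double. A Python illustration (microsecond rather than tick resolution, so the exact value ends in .562 instead of .5620; decimal stands in for exact integer arithmetic):

```python
from datetime import datetime, timezone
from decimal import Decimal

_MAX = datetime(9999, 1, 1, tzinfo=timezone.utc)
event = datetime(2024, 1, 11, 1, 41, 3, 735438, tzinfo=timezone.utc)

delta = _MAX - event
# Exact elapsed microseconds as an integer: no rounding anywhere.
exact_us = (delta.days * 86400 + delta.seconds) * 10**6 + delta.microseconds

# Exact milliseconds via Decimal: all digits survive.
exact_ms = Decimal(exact_us) / Decimal(1000)

# The same value squeezed through a double drops the trailing digits.
lossy_ms = exact_us / 1000.0

print(exact_ms)   # 251665827536264.562
print(lossy_ms)   # 251665827536264.56
```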

Solution

Please document the current and intended behavior of paging for this API. Currently it is not clear how to use it or what the precision is.
I suggest documenting that the intended behavior is to receive duplicates and handle them yourself (Option A) - better than the inconsistent behavior of Option B.
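Client-side duplicate handling for Option A could be sketched like this (the page structure and the "id" field below are illustrative, not the exact API schema - the point is only to deduplicate on some stable event identifier):

```python
# Sketch of Option A on the client: the next window starts at the newest
# event's timestamp, so overlapping events must be dropped by identifier.

def merge_pages(pages):
    seen = set()
    events = []
    for page in pages:
        for event in page:
            # "id" is a hypothetical stable identifier; deduplicate on it.
            if event["id"] not in seen:
                seen.add(event["id"])
                events.append(event)
    return events

# The second fetch starts at the newest event of the first, so event 2 repeats.
first_fetch = [{"id": 1, "date": "2024-01-11T01:41:03.735438Z"},
               {"id": 2, "date": "2024-01-11T01:41:05.120000Z"}]
second_fetch = [{"id": 2, "date": "2024-01-11T01:41:05.120000Z"},
                {"id": 3, "date": "2024-01-11T01:41:07.900000Z"}]

merged = merge_pages([first_fetch, second_fetch])
print([e["id"] for e in merged])  # [1, 2, 3]
```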

Fix

There are a few options to fix this problem:

  1. Make the end date exclusive.
    • Benefits:
      • A one-line change; the easiest.
    • Cons:
      • Breaks backward compatibility of the end parameter.
      • For consistency, it has to be changed for the other DB implementations as well.
  2. Force the precision to 1 millisecond (keep only the first 3 fractional digits). Do it everywhere, including the event dates returned from the API.
    • Benefits:
      • Consistency.
      • Easier to use - languages like Python only support 6 fractional digits for seconds.
      • A single DB implementation change, for Azure Cosmos DB.
    • Cons:
      • A lot of places to change.
      • Breaks backward compatibility of the start and end parameters.
      • The event date stored via DateTimeToTableStorageKey will now have a different precision. If the key is sorted lexicographically (string comparison), a trailing ".0" might have to be added for backward compatibility (2024-01-11T15:19:14.034Z yields 251665778445965.97, while 2024-01-11T15:19:14.035Z yields 251665778445965, which may sort in the wrong order if the explicit .0 is missing).
  3. Fix the precision at 7 digits by using the decimal type (use Ticks instead of TotalMilliseconds, store in a decimal, divide by 10000).
    • Benefits:
      • Backward compatible.
    • Cons:
      • Needs changes in the Azure Cosmos DB implementation for both querying and creating events.
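Option 3 can be sketched as follows (Python for illustration, with tick arithmetic approximated at microsecond resolution since Python's datetime has no 0.1 microsecond ticks; the function name is hypothetical):

```python
from datetime import datetime, timezone
from decimal import Decimal

_MAX = datetime(9999, 1, 1, tzinfo=timezone.utc)

def storage_key_decimal(d: datetime) -> Decimal:
    # Lossless variant of the key: exact integer "ticks" divided with
    # Decimal, so every distinct timestamp yields a distinct key.
    delta = _MAX - d
    us = (delta.days * 86400 + delta.seconds) * 10**6 + delta.microseconds
    ticks = us * 10                          # 1 tick = 0.1 us in .NET
    return Decimal(ticks) / Decimal(10_000)  # ticks -> milliseconds

a = datetime(2024, 1, 11, 1, 41, 3, 735438, tzinfo=timezone.utc)
b = datetime(2024, 1, 11, 1, 41, 3, 735439, tzinfo=timezone.utc)

# The two timestamps that collided as doubles now get distinct keys,
# and the older date still maps to the larger key (descending order).
print(storage_key_decimal(a) != storage_key_decimal(b))  # True
```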

Build Version

2024.1.1

Environment

Cloud (bitwarden.com)

Environment Details

No response

Issue Tracking Info

  • I understand that work is tracked outside of GitHub. A PR will be linked to this issue should one be opened to address it, but Bitwarden doesn't use fields like "assigned", "milestone", or "project" to track progress.
@WaciX WaciX added the bug label Jan 21, 2024
Greenderella (Member) commented
Hi there,

I am unable to reproduce this issue, so it has been escalated for further investigation. If you have more information that can help us, please add it below.

Thanks!

kspearrin (Member) commented

Thanks for the report, @WaciX. I believe your analysis here is correct. We will evaluate some possible solutions.
