-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SortedSet.IsSubsetOf not working as expected #102118
Comments
The problem seems to come from the assumption that underlying tree is always balanced. Minimal repro (for me, .NET 8): HashSet<int> h = new();
SortedSet<int> ss = new();
var num = 18;
for (int i = 0; i < num; i++)
{
h.Add(i);
ss.Add(i);
}
var data = Enumerable.Range(0, num).Append(17);
var hashSetEquals = h.SetEquals(data);
var sortedSetEquals = ss.SetEquals(data);
Console.WriteLine(hashSetEquals);
Console.WriteLine(sortedSetEquals); Demo @sharplab.io |
From my analysis @Stackoverflow (please correct me if I got something wrong): As far as I can see simple element count is used to determine the size for int originalLastIndex = Count;
int intArrayLength = BitHelper.ToIntArrayLength(originalLastIndex);
BitHelper bitHelper = intArrayLength <= StackAllocThreshold ?
new BitHelper(span.Slice(0, intArrayLength), clear: true) :
new BitHelper(new int[intArrayLength], clear: false); But index of the found item is determined the following way: internal virtual int InternalIndexOf(T item)
{
Node? current = root;
int count = 0;
while (current != null)
{
int order = comparer.Compare(item, current.Item);
if (order == 0)
{
return count;
}
current = order < 0 ? current.Left : current.Right;
count = order < 0 ? (2 * count + 1) : (2 * count + 2);
}
return -1;
} Which "overflows" the calculated
Which lead to this code always marking and counting duplicates: int index = InternalIndexOf(item);
if (index >= 0)
{
if (!bitHelper.IsMarked(index))
{
if (hashSet.ContainsKey(item))
{
Console.WriteLine($"Here {item}");
}
else
{
hashSet.Add(item, index);
}
// item hasn't been seen yet
bitHelper.MarkBit(index);
uniqueFoundCount++;
}
} |
Same on .NET Framework? |
@danmoseley seems so: https://dotnetfiddle.net/zXkZuS Also it seems that this happens not only in the "unbalanced" case. If number of elements is divisible by 32 (the bit size of HashSet<int> h = new HashSet<int>();
SortedSet<int> ss = new SortedSet<int>();
var num = 32;
for (int i = 0; i < num; i++)
{
h.Add(i);
ss.Add(i);
}
var data = Enumerable.Repeat(0, 2).SelectMany(i => Enumerable.Range(0, num));
var hashSetEquals = h.SetEquals(data);
var sortedSetEquals = ss.SetEquals(data);
Console.WriteLine(hashSetEquals);
Console.WriteLine(sortedSetEquals);
// balance set
ss = new SortedSet<int>(h.ToList()); // ss = new SortedSet<int>(ss.ToList());
sortedSetEquals = ss.SetEquals(data);
Console.WriteLine("Balanced SortedSet:");
Console.WriteLine(sortedSetEquals); // FALSE too https://dotnetfiddle.net/x9YQHT Also it seems that for all collection sizes bigger than 64 it happens also. Checked several of them: |
By the way, the following property-based test catches this instantly: open FsCheck
open System.Collections.Generic
open NUnit.Framework
[<TestFixture>]
module Foo =
[<Test>]
let ``SortedSet test, positive case`` () =
let property (sub : byte list) (super : byte list) =
let super = sub @ super // or `super @ sub`; any standard way to construct a superset will demonstrate the problem, as will randomising this list
(SortedSet sub).IsSubsetOf super
|> (=) ((Set.ofList sub).IsSubsetOf (Set.ofList super))
Check.QuickThrowOnFailure property There's already a usage of FsCheck in the .NET runtime tests; it would be very little effort to add dramatically more coverage to the existing tests by converting them to property-based tests. |
Description
I don't believe that
SortedSet.IsSubsetOf
is working as expected. Given this code, shouldn'tss.IsSubsetOf(l)
return True?I'm suspicious that the problem is in my
CompareTo
function, but I can't see the problem.I also added a separate loop thinking that the set had to be found sorted inside the
other
IEnumerable . . . but that doesn't work either.Can anyone explain why this behavior is happening?
Reproduction Steps
Expected behavior
ss.IsSubsetOf(l)
should be TrueActual behavior
ss.IsSubsetOf(l))
is FalseRegression?
No response
Known Workarounds
If I remove the second
l.Add(p)
, it works as expected.Configuration
Windows 11, .NET 8, x64
$ dotnet --list-sdks
8.0.204 [C:\Program Files\dotnet\sdk]
Other information
No response
The text was updated successfully, but these errors were encountered: