Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds UnsafeCollectionOperations for unsafe access to RepeatedField<T> #16772

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

PascalSenn
Copy link

@PascalSenn PascalSenn commented May 7, 2024

This is a proposal to add UnsafeCollectionOperations for fast access on RepeatedField<T>
#16745

@PascalSenn PascalSenn requested a review from a team as a code owner May 7, 2024 17:57
@PascalSenn PascalSenn requested review from jskeet and removed request for a team May 7, 2024 17:57
Copy link

google-cla bot commented May 7, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@jskeet jskeet changed the title Adds ReadOnlySpan<T> and ReadOnlyMemory<T> access on RepeatedField<T>` Adds ReadOnlySpan<T> access on RepeatedField<T> May 10, 2024
@jskeet
Copy link
Contributor

jskeet commented May 10, 2024

@JamesNK We're aware that if the repeated field is modified while the span is "active", it may or may not reflect the changes (e.g. add/remove operations will not be seen in terms of the length, and if the array is reallocated any further updates won't be visible at all) - but do you think that's reasonable behavior, so long as we document that appropriately? (I'd just say that the state of the span is undefined after any modifications.) Any other thoughts about this PR?

@JamesNK
Copy link
Contributor

JamesNK commented May 10, 2024

Modification is the problem. Sometimes the span reflects changes, sometimes it doesn't.

I like the idea, but I don't think it should be a property on RepeatedField<T>. It makes the dangerous behavior too easy to access. I suggest doing what the BCL does with List<T> + span and make it a static method on an "advanced" type.

https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.collectionsmarshal.asspan?view=net-8.0

A property is easily discoverable with autocomplete. On the other hand, a method on a type is something people have to go looking to use, and they'll hopefully read the docs about why they should (or shouldn't) use the method.

foreach (var product in CollectionsMarshal.AsSpan(response.Products))
{
    // do stuff
}

I took a look at RepeatedField<T> source code and setting values does nothing more than add the value to the array (with various safety checks). The AsSpan method could return a Span<T> so it can also be used as a fast way to modify the repeated field as well.

var request = new Request();
CollectionsMarshal.SetCount(request.Products, count: 1000);
var productsSpan = CollectionsMarshal.AsSpan(request.Products);

for (var i = 0; i < productsSpan.Length; i++)
{
    productsSpan[i] = // do stuff
}

This would save many internal array allocations and copies as the repeated field grows.

However, SetCount (a new version for repeated fields) + modifying the span directly wouldn't guarantee the values in the repeated field aren't null.

@PascalSenn
Copy link
Author

PascalSenn commented May 12, 2024

@JamesNK @jskeet

I changed the implementation to include your suggestions. There is already a similar class UnsafeByteOperations, so I aligned the naming with it -> UnsafeCollectionOperations. The implementation of the class is very close to the CollectionsMarshal.cs from System.Runtime.InteropServices. I implemented SetCount and AsSpan. I also ported the tests over from CollectionsMarshalTests.cs.

@PascalSenn PascalSenn changed the title Adds ReadOnlySpan<T> access on RepeatedField<T> Adds UnsafeCollectionOperations for unsafe access to RepeatedField<T> May 12, 2024
Copy link
Contributor

@jskeet jskeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed the tests in detail as I'm suggesting quite significant implementation changes.

@PascalSenn
Copy link
Author

@jskeet
Most of this implementation closely aligned with CollectionsMarshal.cs from the runtime. The goal was to keep the implementation and behavior relatively consistent.

I also believe a few things were performance optimizations in the CollectionsMarshal. For example, the aggressive inlining and the direct field access.

I incorporated your feedback into this pull request to better align with the behavior of Protobuf. I left a couple of comments open to provide some context. Feel free to close them as needed after review, as everything should be resolved.

@PascalSenn
Copy link
Author

@jskeet
Copy link
Contributor

jskeet commented May 13, 2024

I'll have a look tomorrow. I would say that I'm not as interested in alignment with CollectionsMarshal as with consistency with the rest of this codebase, and clarity of code and usage.

Copy link
Contributor

@JamesNK JamesNK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its good! Some minor comment and test improvements in feedback

@@ -1,7 +1,7 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFrameworks>net462;net6.0</TargetFrameworks>
Copy link
Contributor

@JamesNK JamesNK May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come net462 is removed? I think it's important to run tests on .NET Framework. If some features aren't supported on .NET Framework then they can be selectively #ifdefed out.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. I removed it to be able to run the test on my unix machine

csharp/src/Google.Protobuf/UnsafeCollectionOperations.cs Outdated Show resolved Hide resolved
Copy link
Contributor

@jskeet jskeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Down to just nits now - it feels like we're very close, thanks very much for this!

@jskeet
Copy link
Contributor

jskeet commented May 15, 2024

cc @amanda-tarafa for any additional thoughts

Copy link

@amanda-tarafa amanda-tarafa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of comments, but looks good otherwise.

@PascalSenn
Copy link
Author

@JamesNK @jskeet @amanda-tarafa
I think i resolved all the suggestions. Feel free to have another look

@jskeet
Copy link
Contributor

jskeet commented May 17, 2024

Thanks - I'm afraid at this point I won't get to this until Monday, but hopefully I'll be able to look then.

Copy link
Contributor

@jskeet jskeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just down to comment nits from my side.

@PascalSenn
Copy link
Author

@jskeet Comments are resolved 🚀

@jskeet jskeet added the 🅰️ safe for tests Mark a commit as safe to run presubmits over label May 20, 2024
@github-actions github-actions bot removed the 🅰️ safe for tests Mark a commit as safe to run presubmits over label May 20, 2024
@jskeet
Copy link
Contributor

jskeet commented May 20, 2024

@JamesNK: Could you take one final pass at this? It looks good to me, but due to the unsafe aspects of it, I'd appreciate a double-check. After that, I can get it reviewed by a member of the protobuf team and merged.

@jskeet
Copy link
Contributor

jskeet commented May 22, 2024

@mkruskal-google: We've reviewed this from a .NET perspective. Let me know if you'd like any more background. (But basically this allows very efficient access to a repeated field, with caveats around anything that might resize it.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants