Performance Best Practices #852
Bidirectional streaming is the right choice if you want to implement a chat app, because it fits how messages are delivered. But a stream keeps using the same backend once it's established, so you may want to take a different approach if you have specific load-balancing needs. @chalin Could you take a look at the ambiguities mentioned in the issue? @ejona86 Could you answer how gRPC can be used in this case? (assuming gRPC-Java is being used here)
Why do you say that? I have to believe it is because of the paragraph starting:
However, it was saying you might have a performance issue, and I don't see you mentioning that you are actually having such a performance issue. Only use multiple channels when you can show it helps performance. (The specific times it helps get into lots of details on how you are load balancing, your networks, and other such things.)
That is one possible way channels work. That is what happens when using the round_robin load balancing policy, and pretty much every load balancing policy except for pick_first. But it does depend on your deployment. It is rarer to need to "use multiple channels" when using round_robin.
Nothing disagrees. Channels can have multiple connections and connections can have multiple streams. But it depends on your setup whether channels will have multiple connections and the limit of streams permitted on a connection.
It said nothing about connection. Maybe you are using the word a bit imprecisely, but when we speak of connections we would be talking about TCP connections. The "Use streaming RPCs" sums up clearly with "Use streams to optimize the application, not gRPC." That is the heart of the matter.
Streams in that context is just "streaming RPCs" essentially. It is saying "when to use the streaming feature."
This depends on your environment. The 100 RPC limit is seen pretty frequently on reverse proxies. But a backend can also set a limit of its own. A 100 RPC limit is common, but it isn't guaranteed. Backend-to-backend communication generally doesn't have that limit. It is really something specific to your deployment.
No. This is exactly the opposite of what the "Use streaming RPCs" section discussed. You aren't helping your application at all by this complexity. With just what you've described, I don't see anything wrong with using "normal" unary RPCs (no streaming).
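The unary-vs-streaming contrast being discussed comes down to the service definition. A hypothetical .proto sketch (names invented for illustration, not from the thread):

```proto
syntax = "proto3";

service WorkService {
  // "Normal" unary RPC: one request, one response. The channel can
  // load-balance each call independently.
  rpc DoWork(WorkRequest) returns (WorkReply);

  // Bidirectional streaming RPC: one long-lived stream. Every message on it
  // stays pinned to the backend the stream was established with.
  rpc DoWorkStream(stream WorkRequest) returns (stream WorkReply);
}

message WorkRequest { string payload = 1; }
message WorkReply { string result = 1; }
```

With the unary form, "use streams to optimize the application, not gRPC" amounts to: don't reach for DoWorkStream unless message ordering or per-stream state actually matters to the application.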
The documentation was not referencing an internal class of a specific implementation. It is internal and that documentation is cross-language.
@veblush, chalin isn't officially maintaining the site any more. He can still show up and help out, but we need to be more self-service these days.
@ananda1066 can take a look instead.
I agree with @ejona86. The tips are fairly general and will not apply to every case; we leave it up to the user to determine which practices improve performance for them and to tune these general practices to their case. This guide simply provides some general guidance to help the user determine the right trade-offs for their specific use case. We don't use stronger language or define exactly when to use unary vs streaming RPCs because this again depends on the use case; we just offer the general performance benefits and drawbacks of using streaming RPCs.
Hi all, many thanks for the quick response. Do please close the ticket if you don't see any merit in the suggested doc change. Here's a response to some of the questions asked by @ejona86.
It was because of my incorrect interpretation of the word 'streams'.
My bad, I should have quoted "Use streaming RPCs when handling a long-lived logical flow of data". My use case involves long-lived server-to-server communication of thousands of HTTP/2 DATA frames per second.
I think this is the heart of the matter, and your answer in the above sentence provides the information about the meaning of the word "streams" that I did not glean from the current doc. FYI, I referenced an internal class just to show what I had interpreted the doc to mean by a "stream". I really thought the doc was saying not to use multiple streams unless perf-tests show that you need to. This is why I opened the ticket: to disambiguate between a 'streaming RPC API' and a 'gRPC stream'.
I intentionally didn't provide details of my use case because, as highlighted by ananda1066 , I understand that the doc has
Since there are more servers than clients, using streaming will make it difficult to load balance, as you've noticed. Given what you said before, it seems that there is no necessary ordering of requests, or any benefit to making sure certain requests go to the same backend as other requests. So unary is ideal in this situation.
Go ahead and file a bug in the grpc-java repository and CC me. We'll discuss more there and I'll respond precisely. Many of the conclusions you have made are incorrect, but I can see how you may have arrived there. And MAX_CONCURRENT_STREAMS can maybe cause you to take alternative approaches. I don't see a strong reason for you to use streams, and there are probably other workarounds, depending on the true source of the issue.
I should have included the fact that in my use case, each client
Many thanks for the offer to investigate further, but I've no reason to believe there is a gRPC bug (just a documentation improvement). According to the HTTP/2 RFC, the INTERNAL_ERROR is expected when a client exceeds the server limit. I have assumed that enhancement 21386 is just a more sophisticated implementation of what I am doing, where the client manages a 'pool of streams' using round-robin/least-loaded etc., is aware of its channel's MAX_CONCURRENT_STREAMS limit, and opens a replacement stream (with back-off throttling) when the old stream closes in error. But if you still think I should open a bug, can you give me a title so that I can fill in the details? Thanks again.
Okay, then yeah, streaming looks much more appropriate. In Java it wouldn't have been too hard to make a Channel implementation that wraps 5-10 ManagedChannels and round-robins over them for each
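The wrapper @ejona86 describes could be sketched as follows. This is a hypothetical, self-contained illustration of just the round-robin selection logic: `Chan` stands in for `io.grpc.ManagedChannel` so the example compiles without gRPC on the classpath; a real wrapper would extend `io.grpc.Channel` and delegate each `newCall()` to the channel picked here.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Round-robin selection over a fixed set of channels. In a real gRPC-Java
// wrapper, Chan would be io.grpc.ManagedChannel and next() would be called
// once per outgoing RPC.
class RoundRobinPicker<Chan> {
    private final List<Chan> channels;
    private final AtomicLong counter = new AtomicLong();

    RoundRobinPicker(List<Chan> channels) {
        if (channels.isEmpty()) {
            throw new IllegalArgumentException("need at least one channel");
        }
        this.channels = List.copyOf(channels);
    }

    // Each call gets the next channel in turn; the atomic counter makes
    // this safe to call from multiple threads.
    Chan next() {
        long i = counter.getAndIncrement();
        return channels.get((int) Math.floorMod(i, channels.size()));
    }
}
```

Because each wrapped ManagedChannel maintains its own connection(s), spreading RPCs over 5-10 of them multiplies the effective MAX_CONCURRENT_STREAMS budget without the application managing individual streams.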
But gRPC does not exceed the limit. Instead of exceeding the limit, gRPC queues locally. You can name the issue something like, "INTERNAL error when creating more streams than MAX_CONCURRENT_STREAMS", which is what you currently think is going on. The title doesn't matter too much though, as I can change it. Please provide the INTERNAL error message you see, as it has useful information.
This explains why it took our C++ guy a bit longer to do the same wrapping of ManagedChannels :). Am very glad to hear that we are on track. In fact we already had a story in the backlog to 'limit the lifetime of the streams to re-balance over time', since the server side could scale up after the clients have established their streams, and the new servers wouldn't get any work unless a new stream is created. This appears to be the most complex part; I should have added it to the list of items I'm expecting from enhancement 21386.

Finally, I understand now your thinking around the bug, and will start preparing the data. Here's an example where I got an INTERNAL_ERROR, and it was during IDLE (easier to study). I know now that we have an AWS load-balancer between our clients and servers, and the LB has a 'hard limit' of 128 MAX_CONCURRENT_STREAMS, meaning that AWS support cannot increase that number. The LB was closing idle streams every minute, so on the server we set .permitKeepAliveTime(10, SEC) and .permitKeepAliveWithoutCalls(true), and on the client we set .keepAliveTime(30, SEC) and .keepAliveWithoutCalls(true). When the system is idle and with 'io.grpc.netty' logging at DEBUG level (neat work, whoever did that!), the client still sees this on one of its two LB IP addresses; a minute +123ms later, a stream to the other LB IP closes; this cycle continues indefinitely. Am thinking that we should resolve this issue before studying issues under traffic; will keep digging and create that bug ticket as required. Thanks again.
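For reference, the keepalive settings quoted in this comment map onto gRPC-Java builder options. A configuration sketch mirroring those values (the port and target are placeholders, and the values come from the comment above, not a general recommendation):

```java
import java.util.concurrent.TimeUnit;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.Server;
import io.grpc.netty.NettyServerBuilder;

// Server side: permit client pings as often as every 10s, even with no active RPC.
Server server = NettyServerBuilder.forPort(50051)
    .permitKeepAliveTime(10, TimeUnit.SECONDS)
    .permitKeepAliveWithoutCalls(true)
    .build();

// Client side: ping every 30s to keep the LB from closing idle connections.
ManagedChannel channel = ManagedChannelBuilder.forTarget("my-lb:443")
    .keepAliveTime(30, TimeUnit.SECONDS)
    .keepAliveWithoutCalls(true)
    .build();
```

Note that the client's keepAliveTime must not be more aggressive than the server's permitKeepAliveTime, or the server will close the connection with a GOAWAY ("too_many_pings").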
The general section of https://grpc.io/docs/guides/performance/ led me to believe that to scale my application, I needed to use many gRPC channels and only one gRPC stream per channel between my long-lived client and servers. This interpretation didn't match my understanding of a stream as a point-to-point path via 'sub-channels' from client to server, or my understanding of a channel as a Y-shaped collection of one sub-channel from the client to each server. Nor does it match https://grpc.io/blog/grpc-on-http2/ which shows a single channel with many streams.
I now suspect that the author is recommending the use of a 'Unary RPC' API such as
over a 'Streaming RPC' API where both MyRequest and MyResponse are streamed
I find this confusing because the paragraph is headed with "Use streaming RPCs" for long-lived connections. My original interpretation was that I should use "streaming RPCs" for long-lived connections but avoid using more than one 'stream' on my streaming RPC. I interpreted the word "streams" (plural) to mean more than one instance of class io.grpc.internal.Stream, such as that created from pseudo code
which to my knowledge
a) causes the client to use a new Stream (I see that class:io.grpc.internal.ClientCallImpl method:startInternal invokes class:clientStreamProvider method:newStream)
and from my own recent reading elsewhere ...
b) the client must normally ensure that it does not create more than 100 'myStream' per channel.
I currently cache just one instance of myStream and invoke myStream.onNext(myRequest) many times. I now suspect that I may not need to use a Streaming RPC API at all, but that if I do, I should cache up to N instances of myStream, where N is the number of servers (and should be < 100), and that I should do my own load-balancing so that the load is shared over each myStream.
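The caching-and-balancing scheme described here could be sketched as follows. This is a hypothetical `StreamPool`, not gRPC API: the `StreamObserver` interface is re-declared locally so the example is self-contained, standing in for `io.grpc.stub.StreamObserver` as obtained from an async stub's streaming call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical pool that caches N request-stream observers (N kept under the
// ~100-streams-per-connection limit discussed above) and spreads requests
// across them round-robin, i.e. client-side load balancing over streams.
class StreamPool<Req> {
    // Local stand-in for io.grpc.stub.StreamObserver.
    interface StreamObserver<T> {
        void onNext(T value);
        void onCompleted();
    }

    private final List<StreamObserver<Req>> streams = new ArrayList<>();
    private int next = 0;

    // 'opener' would open one streaming RPC per backend,
    // e.g. () -> asyncStub.doWorkStream(responseObserver).
    StreamPool(int n, Supplier<StreamObserver<Req>> opener) {
        for (int i = 0; i < n; i++) {
            streams.add(opener.get());
        }
    }

    // Send each request on the next cached stream in turn.
    synchronized void send(Req request) {
        streams.get(next).onNext(request);
        next = (next + 1) % streams.size();
    }
}
```

A real version would also need the re-balancing discussed later in the thread: closing and reopening streams over time so that newly scaled-up backends receive work.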
Am wondering therefore if the documentation should be updated to disambiguate between a 'streaming RPC API' and a gRPC stream such as io.grpc.internal.ClientStream. E.g. the text "They also might increase performance" could be replaced with "Streaming RPC APIs also might increase performance".
Something like the following text would have been clearer to me;
Use keepalive pings ...(as is)
Use Unary RPCs for short or long-lived connections unless the data passed in each RPC is large or performance results show that Streaming RPCs are necessary to give the application more control over the underlying stream.
Use Streaming RPCs for long-lived connections when the client is capable of creating multiple streams and providing its own load-balancing so that each server is appropriately loaded. In future, the client will no longer need to do this ....