Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-15541: Add num-open-iterators metric #15975

Merged
merged 3 commits into from
May 22, 2024
Merged

Conversation

nicktelford
Copy link
Contributor

Part of KIP-989.

This new StateStore metric tracks the number of Iterator instances that have been created, but not yet closed (via
AutoCloseable#close()).

This will aid users in detecting leaked iterators, which can cause major performance problems.

Part of [KIP-989](https://cwiki.apache.org/confluence/x/9KCzDw).

This new `StateStore` metric tracks the number of `Iterator` instances
that have been created, but not yet closed (via
`AutoCloseable#close()`).

This will aid users in detecting leaked iterators, which can cause major
performance problems.
@nicktelford
Copy link
Contributor Author

@mjsax @lucasbru @ableegoldman I know the vote hasn't closed yet, but I thought I'd get a head-start on the review. I'm submitting each metric from KIP-989 as a separate PR, to aid review, and because they can be merged independently.

I think I've covered every Metered Iterator created; I searched for Metered**Store and found every Iterator constructed by them. Are there any that I've missed?

Note: I'm not completely sold on the test strategy here. I'm currently calling one Iterator-creating method on each Metered store type (e.g. all() on MeteredKeyValueStore) - is this enough, or should I have a test for each Iterator-creating method each store provides? Might be somewhat overkill.

Like all the other Iterators, we need to ensure we decrement our counter
even if an exception is thrown while closing the underlying Iterator.
@mjsax mjsax added streams kip Requires or implements a KIP labels May 16, 2024
Copy link
Member

@lucasbru lucasbru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Others know the state store hierarchy better than I do, so maybe it would be good to get another review. But I couldn't find anything wrong with this change, so LGTM, thanks!

@@ -162,6 +165,8 @@ private void registerMetrics() {
flushSensor = StateStoreMetrics.flushSensor(taskId.toString(), metricsScope, name(), streamsMetrics);
deleteSensor = StateStoreMetrics.deleteSensor(taskId.toString(), metricsScope, name(), streamsMetrics);
e2eLatencySensor = StateStoreMetrics.e2ELatencySensor(taskId.toString(), metricsScope, name(), streamsMetrics);
StateStoreMetrics.addNumOpenIteratorsGauge(taskId.toString(), metricsScope, name(), streamsMetrics,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to add a Sensor that allows us to track the different metrics in one go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can Sensors track Gauges? I don't believe that they can.

…/metrics/StateStoreMetrics.java


Use lowercase for "iterators" in description

Co-authored-by: Matthias J. Sax <[email protected]>
@mjsax mjsax merged commit 5552f5c into apache:trunk May 22, 2024
1 check failed
@mjsax
Copy link
Member

mjsax commented May 22, 2024

Merged to trunk to make progress on the implementation of the KIP.

About testing: I agree that we might not need to test every method which creates an iterator, but it would be great to test one method per "iterator type", per store? Ie, if there are two constructors for two iterator classed, we should try to test both by having a test for one method which create the one, or other iterator?

About Sensor: you are right, that it currently does not support Gauge, but I am wondering if we could not just add it? The problem I see it, that Sensor seems to be a public API, so we would need to include it in the KIP to change it... (might be better to do this a follow up cleanup if we think it's worth doing?)

@nicktelford
Copy link
Contributor Author

About Sensor: you are right, that it currently does not support Gauge, but I am wondering if we could not just add it? The problem I see it, that Sensor seems to be a public API, so we would need to include it in the KIP to change it... (might be better to do this a follow up cleanup if we think it's worth doing?)

@mjsax Since Sensor is part of the kafka-common package, it would have a much broader scope than just kafka-streams. For that reason, I think it should be a separate KIP, since the scope of KIP-989 is limited to just Streams.

@mjsax
Copy link
Member

mjsax commented May 23, 2024

Makes sense. Could you file a Jira for the KIP and the follow up cleanup for KS code to use it :)

rreddy-22 pushed a commit to rreddy-22/kafka-rreddy that referenced this pull request May 24, 2024
Part of [KIP-989](https://cwiki.apache.org/confluence/x/9KCzDw).

This new `StateStore` metric tracks the number of `Iterator` instances
that have been created, but not yet closed (via `AutoCloseable#close()`).

This will aid users in detecting leaked iterators, which can cause major
performance problems.

Reviewers: Lucas Brutschy <[email protected]>, Matthias J. Sax <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kip Requires or implements a KIP streams
Projects
None yet
3 participants