NullPointerException if kafka producer has partitions that have no leader #114

Open
mouse256 opened this issue Oct 26, 2018 · 7 comments

@mouse256

If Kafka is in an unhealthy state and has partitions that currently don't have a leader, the KafkaProducer crashes with a NullPointerException.
I believe those partitions should be ignored in that case.

Stacktrace that happens in that case:
ERROR i.v.c.i.ContextImpl - Unhandled exception {}
java.lang.NullPointerException: null
	at io.vertx.kafka.client.common.impl.Helper.from(Helper.java:87) ~[vertx-kafka-client-3.5.2.jar:3.5.2]
	at io.vertx.kafka.client.producer.impl.KafkaProducerImpl.lambda$partitionsFor$7(KafkaProducerImpl.java:165) ~[vertx-kafka-client-3.5.2.jar:3.5.2]
	at io.vertx.core.impl.FutureImpl.setHandler(FutureImpl.java:79) ~[vertx-core-3.5.3.jar:3.5.3]
	at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:289) ~[vertx-core-3.5.3.jar:3.5.3]
	at io.vertx.core.impl.ContextImpl.lambda$wrapTask$2(ContextImpl.java:339) ~[vertx-core-3.5.3.jar:3.5.3]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) [netty-common-4.1.19.Final.jar:4.1.19.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) [netty-common-4.1.19.Final.jar:4.1.19.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463) [netty-transport-4.1.19.Final.jar:4.1.19.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886) [netty-common-4.1.19.Final.jar:4.1.19.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.19.Final.jar:4.1.19.Final]
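
One way to read the "ignore those partitions" suggestion is to drop leaderless partitions before converting them. The following is only a minimal, self-contained sketch under that assumption, not the library's code: `BrokerNode`, `PartitionMeta` and `LeaderlessFilter` are hypothetical stand-ins (for Kafka's `Node` and `PartitionInfo`, whose `leader()` is documented to return null when a partition has no leader).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.apache.kafka.common.Node.
class BrokerNode {
  final int id;
  final String host;

  BrokerNode(int id, String host) {
    this.id = id;
    this.host = host;
  }
}

// Hypothetical stand-in for org.apache.kafka.common.PartitionInfo.
class PartitionMeta {
  final String topic;
  final int partition;
  final BrokerNode leader; // null when the partition currently has no leader

  PartitionMeta(String topic, int partition, BrokerNode leader) {
    this.topic = topic;
    this.partition = partition;
    this.leader = leader;
  }
}

class LeaderlessFilter {
  // Returns only the partitions that currently have a leader,
  // so that a later conversion step never dereferences a null leader.
  static List<PartitionMeta> withLeader(List<PartitionMeta> all) {
    List<PartitionMeta> result = new ArrayList<>();
    for (PartitionMeta p : all) {
      if (p.leader != null) {
        result.add(p);
      }
    }
    return result;
  }
}
```

The trade-off of this approach is that callers no longer see the leaderless partitions at all, which may hide the unhealthy state of the cluster from them.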

@ppatierno
Member

Thanks for reporting, I'll look into this.

@anhldbk
Contributor

anhldbk commented Nov 19, 2018

@ppatierno Following this post, I think we should have checks in the loop

	for (org.apache.kafka.common.PartitionInfo kafkaPartitionInfo : done.result()) {

		PartitionInfo partitionInfo = new PartitionInfo();

		partitionInfo
			.setInSyncReplicas(
				Stream.of(kafkaPartitionInfo.inSyncReplicas()).map(Helper::from).collect(Collectors.toList()))
			.setLeader(Helper.from(kafkaPartitionInfo.leader())) // -> WE MAY NEED TO CHECK IF THIS VALUE IS NULL
			.setPartition(kafkaPartitionInfo.partition())
			.setReplicas(
				Stream.of(kafkaPartitionInfo.replicas()).map(Helper::from).collect(Collectors.toList()))
			.setTopic(kafkaPartitionInfo.topic());

		partitions.add(partitionInfo);
	}
	handler.handle(Future.succeededFuture(partitions));
} else {
	handler.handle(Future.failedFuture(done.cause()));
}

We also need to fix that logic in KafkaConsumerImpl.

Should I make a PR with the approach above?
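
To make the null check flagged in the snippet concrete, here is a rough sketch of a null-tolerant conversion. It is an illustration only: `SourceNode`, `TargetNode` and `NullSafeHelper.fromNullable` are hypothetical names standing in for Kafka's `Node`, the Vert.x-side node object and a conversion like `Helper.from`; none of them are existing library APIs.

```java
// Hypothetical stand-in for org.apache.kafka.common.Node.
class SourceNode {
  final int id;
  final String host;
  final int port;

  SourceNode(int id, String host, int port) {
    this.id = id;
    this.host = host;
    this.port = port;
  }
}

// Hypothetical stand-in for the Vert.x-side node data object.
class TargetNode {
  final int id;
  final String host;
  final int port;

  TargetNode(int id, String host, int port) {
    this.id = id;
    this.host = host;
    this.port = port;
  }
}

class NullSafeHelper {
  // Converts a node, passing a missing leader through as null
  // instead of dereferencing it and throwing a NullPointerException.
  static TargetNode fromNullable(SourceNode node) {
    if (node == null) {
      return null;
    }
    return new TargetNode(node.id, node.host, node.port);
  }
}
```

With a conversion like this, the setLeader(...) call would receive null for a leaderless partition, so callers of partitionsFor would have to be prepared for a null leader.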

@barbarosalp

barbarosalp commented Feb 14, 2019

In my case, done.result() throws a NullPointerException when I call rxPartitions for a non-existent topic with auto.create.topics.enable=false.

I think it should be caught and Future.failedFuture(ex.getCause()) should be called with the error.
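
A sketch of what "catch it and fail the future" could look like, written against a stand-in type so it runs on its own. `Outcome` is a hypothetical substitute for Vert.x's `AsyncResult`, and `SafeMapping.map` is an illustrative helper, not something the client provides; the point is only that a null result or an exception in the conversion becomes a failed result instead of escaping to the event loop.

```java
import java.util.function.Function;

// Hypothetical stand-in for an async result: either a value or a failure cause.
final class Outcome<T> {
  final boolean success;
  final T value;
  final Throwable cause;

  private Outcome(boolean success, T value, Throwable cause) {
    this.success = success;
    this.value = value;
    this.cause = cause;
  }

  static <T> Outcome<T> succeeded(T value) {
    return new Outcome<>(true, value, null);
  }

  static <T> Outcome<T> failed(Throwable cause) {
    return new Outcome<>(false, null, cause);
  }
}

class SafeMapping {
  // Applies the converter to a successful result and reports every
  // problem (upstream failure, null value, converter exception) as a failure.
  static <T, R> Outcome<R> map(Outcome<T> done, Function<T, R> converter) {
    if (!done.success) {
      return Outcome.failed(done.cause);
    }
    if (done.value == null) {
      return Outcome.failed(new IllegalStateException("no partition metadata returned"));
    }
    try {
      return Outcome.succeeded(converter.apply(done.value));
    } catch (RuntimeException e) {
      return Outcome.failed(e);
    }
  }
}
```

In the real handler, the equivalent would be to pass the failed result to handler.handle(...) so that the caller is always notified.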

@b1zzu

b1zzu commented Feb 16, 2022

Hi, I'm having the same issue, and I also came to the conclusion that it is done.result() that returns null; the loop therefore throws a java.lang.NullPointerException and the handler never gets notified of this error.

My traceback:

2022-02-15 14:14:11 ERROR [ContextImpl:] Unhandled exception
java.lang.NullPointerException: null
	at io.vertx.kafka.client.consumer.impl.KafkaConsumerImpl.lambda$partitionsFor$8(KafkaConsumerImpl.java:466) ~[vertx-kafka-client-4.2.4.jar:4.2.4]
	at io.vertx.kafka.client.consumer.impl.KafkaReadStreamImpl.lambda$null$1(KafkaReadStreamImpl.java:130) ~[vertx-kafka-client-4.2.4.jar:4.2.4]
	at io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:100) ~[vertx-core-4.2.4.jar:4.2.4]
	at io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:63) ~[vertx-core-4.2.4.jar:4.2.4]
	at io.vertx.core.impl.EventLoopContext.lambda$runOnContext$0(EventLoopContext.java:38) ~[vertx-core-4.2.4.jar:4.2.4]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [netty-common-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) [netty-common-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:503) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.73.Final.jar:4.1.73.Final]
	at java.lang.Thread.run(Thread.java:829) [?:?]

@vietj vietj added this to the 4.2.5 milestone Feb 16, 2022
@vietj
Contributor

vietj commented Feb 16, 2022

ping @ppatierno

@vietj vietj modified the milestones: 4.2.5, 4.2.6 Feb 16, 2022
@ppatierno
Member

I have to find the time to finally take a look at this issue :-(

@vietj vietj modified the milestones: 4.3.2, 4.3.3 Jul 6, 2022
@vietj vietj modified the milestones: 4.3.3, 4.3.4 Aug 9, 2022
@vietj vietj modified the milestones: 4.3.4, 4.3.5 Oct 1, 2022
@vietj vietj modified the milestones: 4.3.5, 4.4.0 Nov 18, 2022
@vietj vietj modified the milestones: 4.4.0, 4.4.1 Mar 2, 2023
@aesteve
Contributor

aesteve commented Mar 18, 2023

Has anyone managed to reproduce this issue?

I tried a few things, like calling producer.write(...) on a topic that does not exist with auto.create.topics.enable set to false. All I get is a failed future (so .result() is null, but that's expected), but unfortunately no crash.

I understand the fix proposed by @anhldbk (and it looks like the right one) and we could implement it safely but I'd like to reproduce the issue first.

In fact, if the infamous "metadata is not present after ..." error happens, it seems to be handled well.

And I can't find a way to have the Kafka standard client return a null List. It seems it'll either return an empty List or throw an exception, even if the cluster is in an unhealthy state.

Which leads me to think that yes, we could add a band-aid null check, but the problem might be buried a bit deeper.

@vietj vietj modified the milestones: 4.4.1, 4.4.2 Mar 31, 2023
@vietj vietj modified the milestones: 4.4.2, 4.4.3 May 12, 2023
@vietj vietj modified the milestones: 4.4.3, 4.4.4-SNAPSHOT, 4.4.4 Jun 7, 2023
@vietj vietj modified the milestones: 4.4.4, 4.4.5 Jun 22, 2023
@vietj vietj modified the milestones: 4.4.5, 4.4.6 Aug 30, 2023
@vietj vietj modified the milestones: 4.4.6, 4.5.0 Sep 12, 2023
@vietj vietj modified the milestones: 4.5.0, 4.5.1 Nov 15, 2023
@vietj vietj modified the milestones: 4.5.1, 4.5.2 Dec 13, 2023
@vietj vietj modified the milestones: 4.5.2, 4.5.3 Jan 30, 2024
@vietj vietj modified the milestones: 4.5.3, 4.5.4 Feb 6, 2024
@vietj vietj modified the milestones: 4.5.4, 4.5.5 Feb 22, 2024
@vietj vietj modified the milestones: 4.5.5, 4.5.6 Mar 14, 2024
@vietj vietj modified the milestones: 4.5.6, 4.5.7, 4.5.8 Mar 21, 2024
@vietj vietj modified the milestones: 4.5.8, 4.5.9 May 24, 2024