bug: crons with concurrency limits set cause the engine to crash #483

Closed
trisongz opened this issue May 10, 2024 · 3 comments · Fixed by #486
Comments

@trisongz

I have hatchet self-hosted in a K8s cluster.

Container Images:

  • engine: ghcr.io/hatchet-dev/hatchet/hatchet-engine:v0.26.1
  • api: ghcr.io/hatchet-dev/hatchet/hatchet-api:v0.26.1
  • rabbitmq: docker.io/bitnami/rabbitmq:3.13.2-debian-12-r0

SDK: Python - hatchet-sdk-0.23.0 (0.22.5 prior)

After version 0.23.0, I've consistently run into the following issue when a cron task gets triggered, which then causes a reboot loop on the engine container:

2024-05-10T15:34:53.555Z INF workflow 491b44e5-34ad-4764-847b-fedf8f838362 has concurrency settings service=workflows-controller
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x78 pc=0x14053cf]

goroutine 41 [running]:
github.com/hatchet-dev/hatchet/internal/services/controllers/workflows.(*WorkflowsControllerImpl).scheduleGetGroupAction(0xc00372ef50, {0x1b11638?, 0xc000751320?}, 0x0)
	/hatchet/internal/services/controllers/workflows/queue.go:211 +0xcf
github.com/hatchet-dev/hatchet/internal/services/controllers/workflows.(*WorkflowsControllerImpl).handleWorkflowRunQueued(0xc00372ef50, {0x1b11590?, 0x3d2f720?}, 0xc000750900)
	/hatchet/internal/services/controllers/workflows/queue.go:72 +0x618
github.com/hatchet-dev/hatchet/internal/services/controllers/workflows.(*WorkflowsControllerImpl).handleTask(0xc0004b3ad0?, {0x1b11590, 0x3d2f720}, 0xc000750900)
	/hatchet/internal/services/controllers/workflows/controller.go:199 +0x117
github.com/hatchet-dev/hatchet/internal/services/controllers/workflows.(*WorkflowsControllerImpl).Start.func1(0xc0007ba000?)
	/hatchet/internal/services/controllers/workflows/controller.go:161 +0x90
github.com/hatchet-dev/hatchet/internal/msgqueue/rabbitmq.(*MessageQueueImpl).subscribe.func1.2({{0x1b0f940, 0xc00062c7e0}, 0x0, {0x0, 0x0}, {0x0, 0x0}, 0x0, 0x0, {0x0, ...}, ...})
	/hatchet/internal/msgqueue/rabbitmq/rabbitmq.go:502 +0x88b
created by github.com/hatchet-dev/hatchet/internal/msgqueue/rabbitmq.(*MessageQueueImpl).subscribe.func1 in goroutine 154
	/hatchet/internal/msgqueue/rabbitmq/rabbitmq.go:451 +0x5c6

I've attempted the following to debug:

  • Deleting all workflows, which removes the cron schedules and lets the engine container come back up.
  • Recreating RabbitMQ, including its persistent data, which has no effect.

I am able to trigger the workflow manually without issue, but whenever the cron schedule triggers the workflow, the crash occurs.
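For reference, the failing setup is roughly a cron-triggered workflow that also defines a concurrency group. The sketch below is illustrative only (it is not my exact workflow definition) and assumes the decorator-style hatchet-sdk Python API from that release line; the workflow name, cron expression, and the import path for ConcurrencyLimitStrategy should be checked against the SDK version in use.

```python
from hatchet_sdk import ConcurrencyLimitStrategy, Context, Hatchet

hatchet = Hatchet()


# Illustrative cron workflow that also sets a concurrency limit, which is the
# combination that triggers the panic above. Names and values are placeholders.
@hatchet.workflow(on_crons=["*/15 * * * *"])
class ExampleCronWorkflow:
    # Concurrency key function: every run maps to the same group key,
    # limited to a single concurrent run.
    @hatchet.concurrency(max_runs=1, limit_strategy=ConcurrencyLimitStrategy.GROUP_ROUND_ROBIN)
    def concurrency(self, context: Context) -> str:
        return "default"

    @hatchet.step()
    def step1(self, context: Context):
        return {"status": "ok"}


worker = hatchet.worker("example-worker")
worker.register_workflow(ExampleCronWorkflow())
worker.start()
```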

@abelanger5
Contributor

Hey @trisongz, thanks for the report - I'll be taking a look at this today. This looks like an issue where the workflow run isn't created properly from cron workflows when a concurrency limit is set on the workflow. This isn't an issue with RabbitMQ, so there's no need to restart anything on that side (the methods are just being triggered by a RabbitMQ message).

@trisongz
Author

Thanks for the response. I can confirm that after removing concurrency from the workflow, the latest version works.

abelanger5 changed the title from "bug: rabbitmq subscribe issue with hatchet-engine > 0.23.0" to "bug: crons with concurrency limits set cause the engine to crash" on May 11, 2024
@abelanger5
Contributor

This is fixed in v0.26.2
