Investigate cluster CPU spikes #727

ejsmith · 2020-09-27T17:08:02Z

The spikes happen every 8 hours, and orphan data cleanup is the only job that runs on that interval. The job itself doesn't appear to use much CPU, but Elasticsearch CPU spikes while it runs. My guess is that whatever the orphan data cleanup does in Elasticsearch is very expensive.

ejsmith · 2020-09-27T17:12:35Z

The job searches for orphaned documents across all time. Since we run it on a regular basis, we can probably restrict it to events from the last 3 days or so (I forget how far back we allow an event's date to be).
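A minimal sketch of that restriction, assuming the cleanup issues an Elasticsearch query and that events carry a date field. The helper name `restrict_to_recent` and the field name `date` are hypothetical; only the `bool`/`range` query shape and the `now-3d` date math are standard Elasticsearch query DSL.

```python
from typing import Any


def restrict_to_recent(
    query: dict[str, Any], days: int = 3, date_field: str = "date"
) -> dict[str, Any]:
    """Wrap an existing query so it only matches documents from the last `days` days.

    `date_field` is an assumed field name, not taken from the real index mapping.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    return {
        "bool": {
            # Keep the original orphan-matching logic unchanged.
            "must": [query],
            # Filter context: no scoring, and the range clause can be cached.
            "filter": [{"range": {date_field: {"gte": f"now-{days}d"}}}],
        }
    }


# Example: narrow a match-all scan to the last 3 days.
recent_only = restrict_to_recent({"match_all": {}}, days=3)
```

Putting the range clause in filter context keeps it out of scoring, which tends to be cheaper than adding it as another `must` clause.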

ejsmith · 2020-09-27T18:11:34Z

I guess a better approach would be to find recently deleted stacks, projects, and orgs and see if there are any matching events. Ideally we wouldn't have to worry about this happening at all. We need to bulletproof the process of deleting projects, stacks, and orgs and make sure it's impossible for them to be deleted if there are any matching events. We might have a concurrency issue where a stack is deleted but a new event comes in at the same time and gets added to that stack even though it was just deleted.
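One way the targeted lookup could be expressed, as a rough sketch: given the ids of recently deleted stacks, projects, and orgs, build a single query that matches events still pointing at any of them. The function name and the field names `stack_id`, `project_id`, and `organization_id` are assumptions for illustration, and how the list of recently deleted ids is obtained is left open.

```python
from typing import Any, Iterable, Optional


def build_deleted_parent_query(
    stack_ids: Iterable[str] = (),
    project_ids: Iterable[str] = (),
    organization_ids: Iterable[str] = (),
) -> Optional[dict[str, Any]]:
    """Build a query matching events that reference any recently deleted parent.

    Returns None when no ids are supplied, so the caller can skip the search.
    Field names are hypothetical.
    """
    candidates = (
        ("stack_id", stack_ids),
        ("project_id", project_ids),
        ("organization_id", organization_ids),
    )
    clauses = []
    for field, ids in candidates:
        unique_ids = sorted(set(ids))  # de-duplicate and keep output stable
        if unique_ids:
            clauses.append({"terms": {field: unique_ids}})
    if not clauses:
        return None
    # An event is an orphan candidate if it matches at least one clause.
    return {"bool": {"should": clauses, "minimum_should_match": 1}}
```

This only finds events left behind after a delete. It does not close the race described above, where an event is added to a stack while that stack is being deleted.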

niemyjski · 2020-09-28T11:24:55Z

@ejsmith We only allow event submission for the past 3 days. Maybe we only run the full check once a month or once a week?
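A small sketch of how the two cadences could be combined, assuming the job records when it last completed a full scan. The names `choose_scan_start`, `FULL_SCAN_INTERVAL`, and `RECENT_WINDOW` are hypothetical, and the weekly interval is just one of the two options suggested above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

FULL_SCAN_INTERVAL = timedelta(days=7)  # assumed: full check once a week
RECENT_WINDOW = timedelta(days=3)       # events are only accepted for the past 3 days


def choose_scan_start(
    now: datetime, last_full_scan: Optional[datetime]
) -> Optional[datetime]:
    """Return the earliest event date to check, or None for a full scan."""
    if last_full_scan is None or now - last_full_scan >= FULL_SCAN_INTERVAL:
        return None  # never run, or a week has passed: scan all time
    return now - RECENT_WINDOW  # otherwise only check the submission window


# Example: a run one day after the last full scan only checks the last 3 days.
now = datetime(2020, 9, 28, tzinfo=timezone.utc)
start = choose_scan_start(now, now - timedelta(days=1))
```

With this shape, the 8-hour runs stay cheap, and the expensive all-time query runs at most once per interval.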

niemyjski added the Hacktoberfest and investigating labels on Sep 28, 2020
