Bug A lot of errors in log since HA 2024.4.0 causes high CPU usage and crash HA #2023

Open
scharrrfi opened this issue Apr 7, 2024 · 16 comments
Labels
Bug Identifies an issue where the system is not functioning as expected. Performance Performance related issues.

@scharrrfi

scharrrfi commented Apr 7, 2024

TFT Version

4.3.2

ESPHome Version

4.3.2

Blueprint Version

4.3.2

Panel Model

US

What is the bug?

Hi there,

Since updating to HA 2024.4.0 (currently on 2024.4.1), I get a lot of error messages from the NSPanel blueprint in my log for both of my NSPanels:

Logger: homeassistant.components.automation.nspanel_esszimmer
Source: components/automation/__init__.py:726
Integration: Automation (documentation, issues)
First occurred: 11:41:54 (2 occurrences)
Last logged: 11:44:51

While executing automation automation.nspanel_esszimmer
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/automation/__init__.py", line 726, in async_trigger
    return await self.action_script.async_run(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 1650, in async_run
    return await asyncio.shield(create_eager_task(run.async_run()))
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/util/async_.py", line 35, in create_eager_task
    loop=loop or get_running_loop(),
                 ^^^^^^^^^^^^^^^^^^
RuntimeError: no running event loop

I think this is related to home-assistant/core#114849
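
For context, `RuntimeError: no running event loop` is what Python's `asyncio.get_running_loop()` raises whenever it is called from a thread that has no running loop. The sketch below reproduces only that generic Python behavior; it is not Home Assistant code, and whether a worker-thread call is what actually happens in this trace is an assumption. The function names `call_from_worker_thread` and `call_from_coroutine` are made up for illustration.

```python
import asyncio
import threading


def call_from_worker_thread() -> str:
    """Call asyncio.get_running_loop() in a plain thread and return the error text."""
    result = {}

    def worker():
        try:
            asyncio.get_running_loop()
            result["message"] = "loop found"
        except RuntimeError as err:
            # A plain worker thread has no running event loop.
            result["message"] = str(err)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result["message"]


async def call_from_coroutine() -> bool:
    """Inside a coroutine the same call succeeds, because a loop is running."""
    return asyncio.get_running_loop().is_running()


if __name__ == "__main__":
    print(call_from_worker_thread())
    print(asyncio.run(call_from_coroutine()))
```

The same call fails in the thread and succeeds in the coroutine, which is why this error usually points at code touching the event loop from the wrong thread.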

After a while, my HA instance shows high CPU load and stops responding or crashes completely. I also cannot stop the core over SSH; I have to reboot the whole machine.
This behavior only occurs when the NSPanel blueprint is enabled. When I disable it, my system is stable.

With HA 2024.3.x it was also stable, with no error messages.

@scharrrfi scharrrfi added the Bug Identifies an issue where the system is not functioning as expected. label Apr 7, 2024
@edwardtfn edwardtfn added the Performance Performance related issues. label Apr 7, 2024
@scharrrfi
Author

scharrrfi commented Apr 7, 2024

Update: I have done some further investigation with async debug mode. It seems that the problem comes from an entity of the Homematic(IP) Local integration, which I had added to the blueprint for the outside temperature. After removing the temperature entity from the blueprint, HA seems to be stable again, without errors in the log.

I think the blueprint tries to update the temperature too often, so that the bug in the custom integration causes HA to crash.

@scharrrfi
Author

Now I am more confused.
For further investigation I added the temperature entity back to the NSPanel blueprint with async debug mode enabled. No errors. Only warnings from the Homematic integration, like here: danielperna84/hahomematic#1483

When I disable async debug mode, I get the errors shown above again. When I remove the temperature entity from the blueprint, it runs fine again...

@edwardtfn
Collaborator

I get the feeling that something on the HA side is making things use more CPU than normal, so your system is probably surviving a combination of heavy CPU consumers but is at its limit, so any small change could cause a crash.
The Blueprint shouldn't be that CPU intensive, but I've also noticed some increase in CPU usage coming from there and have even worked on some improvements for v4.3.2. There is still room for improvement, but it will probably require some work (and time) on rebuilding the architecture.

@scharrrfi
Author

scharrrfi commented Apr 8, 2024

I don't think this is a performance problem.

My HA runs as a Proxmox virtual machine on an 8-core 10th-gen i7 with 32 GB RAM. At the moment it can use 4 cores and 8 GB RAM. That is more than enough for the system. Other people run it on a Raspberry Pi or slower systems without performance issues.

Before version 2024.4.0 it ran smoothly for years with no issues at all. The CPU load from the core was normally always under 10%.

I think there is a fundamental bug in some integration that uses Python's async library incorrectly. This causes endless loops, so the system crashes sooner or later. But I am not sure why this problem shows up for the first time with the new HA version.
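
To illustrate what using the async library correctly from a thread can look like, here is a minimal sketch using only the Python standard library. It hands work from a worker thread to the event loop with `asyncio.run_coroutine_threadsafe` instead of touching the loop directly. `update_temperature` and `push_from_thread` are hypothetical stand-ins, not functions from Home Assistant, the blueprint, or the Homematic integration, and nothing in this thread confirms that this is the fix those projects applied.

```python
import asyncio


async def update_temperature(value: float) -> str:
    """Hypothetical coroutine standing in for work that must run on the loop."""
    return f"temperature set to {value}"


def push_from_thread(loop: asyncio.AbstractEventLoop, value: float) -> str:
    """Runs in a worker thread: schedule the coroutine on the loop thread-safely."""
    future = asyncio.run_coroutine_threadsafe(update_temperature(value), loop)
    # Block this worker thread (not the event loop) until the result arrives.
    return future.result(timeout=5)


async def main() -> str:
    loop = asyncio.get_running_loop()
    # run_in_executor executes push_from_thread in a thread pool worker,
    # leaving the event loop free to run the scheduled coroutine.
    return await loop.run_in_executor(None, push_from_thread, loop, 21.5)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The loop object is passed into the thread explicitly, so the worker never needs to call `asyncio.get_running_loop()` itself.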

@future159

I can confirm the problems since the update to 4.3.2. I also use the Homematic(IP) Local integration for an outside temperature sensor, which is integrated into the blueprint.
However, I have not yet investigated the connection any further.

@edwardtfn edwardtfn added this to the v4.3.3 - Patch milestone Apr 8, 2024
@edwardtfn
Collaborator

Could you please update your Home Assistant to v2024.4.2 and let me know if this issue persists?

@scharrrfi
Author

scharrrfi commented Apr 8, 2024

No, it doesn't persist!

But I am not sure whether the new HA version or this commit is the reason: danielperna84/hahomematic#1483. Before the update, I switched to the beta of the Homematic integration.

But now the system seems to be stable again. Thank you for your support anyway! :)

@future159

future159 commented Apr 8, 2024 via email

@scharrrfi
Author

Try the beta of the Homematic(IP) Local integration, 1.59.0b0.

This should solve the problem.

@dylanpedro

I'm still having the issue and had to disable the blueprints for the system to function. I also use a local sensor for the temperature (sonoff via zigbee2mqtt).

Not sure what else to try!

@edwardtfn
Collaborator

You can try to find the root cause of this like in this case: nielsfaber/alarmo#920 (comment)

@future159

Updated to HM(IP) Local 1.59.0b0. Everything is fine for the moment.

@edwardtfn
Collaborator

Ok, looks like this is an issue caused by other integrations.

@edwardtfn edwardtfn removed this from the v4.3.3 - Patch milestone Apr 9, 2024
@JMoratelli

Here is the solution.

nielsfaber/alarmo#920 (comment)

5 participants