Add Runbooks #343
Comments
I fully agree 💯 How can I help? Would you like to start writing runbooks? In the short term, I can adapt the site. The example you provided looks good. I think we can write it in markdown.
Yes, I can help write runbooks. In parallel, maybe you can start creating the page for these runbooks 👍
Hi Folks,
I think it would be nice to have runbooks that outline the procedures to follow when an alert is triggered in a monitoring system. A runbook provides step-by-step instructions for identifying the cause of the alert, assessing its impact, and implementing a solution to resolve the issue.
If that is okay, I would be very happy to contribute.
For example:
HostHighCpuLoad
Meaning
The "HostHighCpuLoad" alert is triggered when the CPU load on a host exceeds a defined threshold. This alert is designed to detect performance issues and potential system instability related to high CPU utilization.
Impact
If this alert is not properly addressed, it may result in degraded performance, system crashes, and potential service disruptions.
Diagnosis
Check the system load average using the following command:
The output will show the current system load average for the past 1, 5, and 15 minutes. If the load average is consistently higher than the number of CPU cores on the system, it indicates that the system is experiencing high CPU load.
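As a rough illustration of the comparison described above, the check can be sketched in Python. This is a minimal sketch, not part of the proposed runbook: the function name `load_exceeds_cores` is hypothetical, and reading "consistently higher" as "all three averages exceed the core count" is an assumption.

```python
import os


def load_exceeds_cores(load_averages, cpu_cores):
    """Return True when every load average (1, 5 and 15 minutes)
    is above the number of CPU cores.

    Treating "consistently higher" as "all three windows exceed the
    core count" is an assumption made for this sketch.
    """
    return all(load > cpu_cores for load in load_averages)


def current_host_is_overloaded():
    """Apply the check to the local host (Unix-like systems only)."""
    loads = os.getloadavg()      # (1 min, 5 min, 15 min)
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return load_exceeds_cores(loads, cores)
```

For instance, on a 4-core host, load averages of 5.2, 4.8 and 4.1 would count as sustained high load, while 5.2, 3.0 and 2.0 would look like a short spike.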
Identify which processes are using the most CPU resources by running the following command:
The output will show the top CPU-consuming processes. Identify any processes that are consuming a significant amount of CPU resources and investigate further.
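The ranking step can also be sketched in Python. This sketch assumes the process list comes from `ps -eo pid,pcpu,comm` (a header line followed by one process per line); the helper names `top_cpu_processes` and `read_ps_output` are hypothetical and are not taken from the proposed runbook.

```python
import subprocess


def top_cpu_processes(ps_output, limit=5):
    """Parse `ps -eo pid,pcpu,comm` style text and return the
    `limit` busiest processes as (pid, cpu_percent, command) tuples."""
    rows = []
    # Skip the header line, then split each row into three fields.
    for line in ps_output.strip().splitlines()[1:]:
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, pcpu, command = parts
        try:
            rows.append((int(pid), float(pcpu), command.strip()))
        except ValueError:
            continue  # ignore rows that do not match the expected layout
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def read_ps_output():
    """Run ps on the local host and return its raw text output."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `top_cpu_processes(read_ps_output())` on the affected host would give a short list of candidates to investigate further.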
Check for any system configuration issues or recent updates that may be causing high CPU load. Look for misconfigured services or applications running on the system that cause excessive CPU usage.
Check the system logs for error messages or warnings that may indicate a problem related to high CPU usage.
Mitigations
To mitigate this alert and address the performance and stability issues related to high CPU load, the following steps can be taken: