Overriding monitoring rules #527

Open
Virsacer opened this issue Jun 9, 2023 · 12 comments

Comments

@Virsacer
Contributor

Virsacer commented Jun 9, 2023

Expected Behavior

When "Global Monitoring Rules" and "Folder Monitoring Rules" have overlapping settings, the latter should "win".

Current Behavior

I have set "When powered off" to "Do nothing" for VMs globally.
But for one folder I have set it to "Trigger a Critical state".
Unfortunately the global setting is used for VMs in that folder:

  [OK] Power State
   \_ [OK] Virtual Machine has been powered off

Your Environment

  • VMware vCenter®/ESXi™-Version: 7.0.3
  • Version/GIT-Hash of this module: 1.7.1
  • Icinga Web 2 version: 2.11.4
  • Operating System and version: Oracle Linux 7
  • Webserver, PHP versions: PHP 7.3.33
@Thomas-Gelf
Contributor

Please take a look at --inspect; does it reflect what you're seeing?

@Virsacer
Contributor Author

Virsacer commented Jun 9, 2023

Hi, I did not know that parameter...

  [OK] Power State (--rule ObjectStatePolicy/PowerState)
   trigger_on_poweredOff = "ignore"
   trigger_on_suspended = "critical" (inherited from AlwaysOnFolder)
   trigger_on_unknown = "critical" (inherited from AlwaysOnFolder)
   \_ [OK] Virtual Machine has been powered off

So "trigger_on_poweredOff" comes from Global, but it should be overridden by "AlwaysOnFolder".

@Thomas-Gelf
Contributor

mysql --binary-as-hex vspheredb -e 'SELECT * FROM monitoring_rule_set\G'

mysql --binary-as-hex vspheredb -e "SELECT * FROM object WHERE object_name = 'AlwaysOnFolder'\G"

@Virsacer
Contributor Author

Virsacer commented Jun 9, 2023

*************************** 1. row ***************************
  object_uuid: 0x
object_folder: vm
     settings: {"ObjectStatePolicy/PowerState/critical_for_uptime_greater_than_days":999,"ObjectStatePolicy/PowerState/trigger_on_poweredOff":"ignore","ObjectStatePolicy/PowerState/warning_for_uptime_greater_than_days":999}
*************************** 2. row ***************************
  object_uuid: 0x499A6581CE425D67B70D22D33CE5DEC1
object_folder: vm
     settings: {"ObjectStatePolicy/PowerState/trigger_on_poweredOff":"critical","ObjectStatePolicy/PowerState/trigger_on_suspended":"critical","ObjectStatePolicy/PowerState/trigger_on_unknown":"critical"}



+------------------------------------+------------------------------------+---------------+----------------+-------------+----------------+-------+------------------------------------+------+
| uuid                               | vcenter_uuid                       | moref         | object_name    | object_type | overall_status | level | parent_uuid                        | tags |
+------------------------------------+------------------------------------+---------------+----------------+-------------+----------------+-------+------------------------------------+------+
| 0x499A6581CE425D67B70D22D33CE5DEC1 | 0x0BD3C813BF9240FF8EE78E6E26FB44D3 | group-v140507 | AlwaysOnFolder | Folder      | gray           |     5 | 0xE3488B78CF4759CA96C2EBC4EEE5C6BD | []   |
+------------------------------------+------------------------------------+---------------+----------------+-------------+----------------+-------+------------------------------------+------+
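For reference, the override the reporter expects from the two monitoring_rule_set rows above can be sketched in Python. This only illustrates the expected precedence (folder settings applied on top of global ones); it is not the module's actual code, and the function name merge_rule_sets is hypothetical. The JSON strings are copied from the two rows.

```python
import json

# Settings copied from the two monitoring_rule_set rows shown above.
GLOBAL_SETTINGS = json.loads(
    '{"ObjectStatePolicy/PowerState/critical_for_uptime_greater_than_days":999,'
    '"ObjectStatePolicy/PowerState/trigger_on_poweredOff":"ignore",'
    '"ObjectStatePolicy/PowerState/warning_for_uptime_greater_than_days":999}'
)
FOLDER_SETTINGS = json.loads(
    '{"ObjectStatePolicy/PowerState/trigger_on_poweredOff":"critical",'
    '"ObjectStatePolicy/PowerState/trigger_on_suspended":"critical",'
    '"ObjectStatePolicy/PowerState/trigger_on_unknown":"critical"}'
)


def merge_rule_sets(*rule_sets):
    """Hypothetical helper: merge rule sets ordered from least to most
    specific, so that a later (more specific) rule set overrides earlier ones."""
    merged = {}
    for settings in rule_sets:
        merged.update(settings)
    return merged


effective = merge_rule_sets(GLOBAL_SETTINGS, FOLDER_SETTINGS)
# With the folder applied last, this prints "critical" rather than "ignore".
print(effective["ObjectStatePolicy/PowerState/trigger_on_poweredOff"])
```

Under this assumed precedence, the two uptime settings that exist only in the global row are kept unchanged.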

@Thomas-Gelf
Contributor

Looks good to me... strange

@Thomas-Gelf
Contributor

There used to be related bugs, but as you're running v1.7.1, they should all have been fixed. Could you please try setting it to "Trigger a warning" on some other folder between AlwaysOnFolder and your root? Does that change anything?

@Virsacer
Contributor Author

Virsacer commented Jun 9, 2023

  [WARNING] Power State (--rule ObjectStatePolicy/PowerState)
   critical_for_uptime_greater_than_days = 999
   trigger_on_poweredOff = "warning" (inherited from Kunden)
   trigger_on_suspended = "critical" (inherited from AlwaysOnFolder)
   trigger_on_unknown = "critical" (inherited from AlwaysOnFolder)
   warning_for_uptime_greater_than_days = 999
   warning_for_uptime_less_than = 900
   \_ [WARNING] Virtual Machine has been powered off

@Virsacer
Contributor Author

Virsacer commented Jun 9, 2023

Looks like it only happens when a setting is on both root and leaf.
When it is on root and in the middle, or only at the leaf, it works...

@Thomas-Gelf
Contributor

Thomas-Gelf commented Jun 9, 2023

If you remove it on the leaf, and set it one level above - does it then work?

@Virsacer
Contributor Author

Virsacer commented Jun 9, 2023

I did some more tests and always set the same value for all three parameters:

When set only on the leaf, it works fine.
As soon as it is set on ANY other level(s), the leaf is ignored.
When it is set on multiple non-leaf levels, the value from the level closest to the leaf wins (but never the leaf itself).
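To make the expected resolution concrete, here is a minimal Python sketch that walks the path from root to leaf and records where each value came from, in the style of the --inspect output. It models the expectation stated in this issue, not the module's implementation; the function name resolve and the shortened setting keys are assumptions for illustration, and the folder names are taken from the outputs above.

```python
def resolve(path):
    """Hypothetical resolver: `path` is a list of (level_name, settings)
    ordered from root to leaf. Each level overrides the ones before it,
    so the level closest to (or at) the leaf wins.
    Returns {setting: (value, origin_level)}."""
    effective = {}
    for level_name, settings in path:
        for key, value in settings.items():
            effective[key] = (value, level_name)
    return effective


path = [
    ("Global", {"trigger_on_poweredOff": "ignore"}),
    ("Kunden", {"trigger_on_poweredOff": "warning"}),
    ("AlwaysOnFolder", {
        "trigger_on_poweredOff": "critical",
        "trigger_on_suspended": "critical",
        "trigger_on_unknown": "critical",
    }),
]

for key, (value, origin) in sorted(resolve(path).items()):
    print(f'{key} = "{value}" (inherited from {origin})')
```

In this model all three settings resolve to "critical" from AlwaysOnFolder, whereas the --inspect output earlier in the thread reported trigger_on_poweredOff = "warning" (inherited from Kunden).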

@Nayakum

Nayakum commented Oct 24, 2023

Hi there, adding to this, as I'm seeing the same issue with the same behaviour (also running v1.7.1):
when "Enabled" is set to "Please choose" for any setting, the object is monitored as though the setting were enabled. Also, once monitoring thresholds are set on a branch, they take precedence over the thresholds of a leaf, even when the monitoring is set to "Please choose" on a branch or leaf.

Leaf:
image

Next closest branch:
image

Host in the leaf group:

[WARNING] Host System, according configured rules
   [OK] Object State Policy (--rule ObjectStatePolicy/*)
      [OK] Overall VMware Object State (--rule ObjectStatePolicy/VMwareObjectState)
       trigger_on_gray = "warning"
       trigger_on_red = "warning"
       trigger_on_yellow = "warning"
       \_ [OK] Overall VMware status is 'green'
      [OK] Power State (--rule ObjectStatePolicy/PowerState)
       critical_for_uptime_greater_than_days = 600
       critical_for_uptime_less_than = 0
       trigger_on_poweredOff = "ignore"
       trigger_on_suspended = "warning"
       trigger_on_unknown = "unknown"
       warning_for_uptime_greater_than_days = 365
       warning_for_uptime_less_than = 0
       \_ [OK] Host System is powered on
       \_ [OK] System booted 298d 2h ago
   [WARNING] Compute Resource Usage (--rule ComputeResourceUsage/*)
      [OK] CPU Usage (--rule ComputeResourceUsage/CpuUsage)
       critical_if_less_than_percent_free = 10 (inherited from [leaf])
       warning_if_less_than_percent_free = 30 (inherited from [leaf])
       \_ [OK] 3.13 GHz out of 57.5 GHz used, 54.3 GHz (94.54%) free
      [WARNING] Memory Usage (--rule ComputeResourceUsage/MemoryUsage)
       critical_if_less_than_percent_free = 2 (inherited from [leaf])
       threshold_precedence = "best_wins"
       warning_if_less_than_percent_free = 20 (inherited from [closest branch])
       \_ [WARNING] 79.32 GiB out of 511.70 GiB (15.50%) free
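The WARNING on Memory Usage follows from the warning threshold inherited from the closest branch. A minimal Python sketch of that arithmetic, assuming a plain "less than percent free" comparison and ignoring threshold_precedence (whose semantics are not described in this thread); the function name classify_free_percent is hypothetical:

```python
def classify_free_percent(free, total, warning_below, critical_below):
    """Hypothetical check: compare the free percentage against
    'less than percent free' thresholds, critical first."""
    percent_free = free / total * 100
    if percent_free < critical_below:
        state = "CRITICAL"
    elif percent_free < warning_below:
        state = "WARNING"
    else:
        state = "OK"
    return state, round(percent_free, 2)


# Values from the Memory Usage output above:
# 79.32 GiB free of 511.70 GiB, warning below 20 %, critical below 2 %.
state, percent_free = classify_free_percent(
    79.32, 511.70, warning_below=20, critical_below=2
)
print(state, percent_free)
```

With roughly 15.5 % free, the value is below the branch's 20 % warning threshold but above the leaf's 2 % critical threshold, which matches the WARNING shown.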

Additionally, the --inspect output doesn't mention when a setting is inherited from the global setting "All vCenters". I find that somewhat unintuitive, since all other inheritances are shown.

@edpstiffel

We just stumbled upon this issue while trying to set individual limits for one datastore. The behaviour is exactly as described a few comments above: if there is a setting on the path to the leaf, the settings directly at the leaf are ignored.
Any idea whether there will be a fix in the near future?
