[Check_mk (english)] Event console: Cancelling a failure event raised by missing expected events
We have a system that is programmed to send "heartbeats" every 10 minutes.
I've configured two rules in the Event Console:
Rule 1:
- Match text and host (as appropriate)
- Text to cancel event: same text as above
- State: CRIT
- Expect regular messages every 15 minutes
- Rewrite hostname (to map the hostname to the correct name)
Rule 2:
- Same settings as rule 1, except state OK and no "expect regular messages" configured
- Also "delete event immediately after action", since we don't need to record every arriving heartbeat
Then, in WATO, I've configured this event to be mapped to a service on a host so that we can see it directly.
The idea here is that if a heartbeat does not arrive at least once within 15 minutes, the visible service state
goes to CRIT.
This part works.
However, how do I cancel this properly, so that the CRIT state of the service gets reset to OK after a
heartbeat does come in after an outage?
I can see, even after heartbeats have resumed, that the "CRIT" event is still active for the host.
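
To make the behaviour I'm after concrete, here is a minimal Python sketch of the state logic I expect. This is only a model of the desired outcome, not Event Console code: the class name HeartbeatMonitor, its methods and the EXPECT_INTERVAL constant are all made up for illustration, and it says nothing about how the Event Console handles this internally.

```python
from dataclasses import dataclass

# Assumption: the 15-minute "expect regular messages" window, in seconds.
EXPECT_INTERVAL = 15 * 60


@dataclass
class HeartbeatMonitor:
    """Hypothetical model of the desired service state (not a Check_MK API)."""

    last_seen: float = 0.0   # timestamp of the most recent heartbeat
    crit_open: bool = False  # True while a "missing heartbeat" event is open

    def on_heartbeat(self, now: float) -> None:
        # An arriving heartbeat should cancel any open CRIT event.
        self.last_seen = now
        self.crit_open = False

    def on_tick(self, now: float) -> None:
        # Periodic check: open a CRIT event once the window has elapsed
        # without a heartbeat.
        if now - self.last_seen > EXPECT_INTERVAL:
            self.crit_open = True

    def state(self) -> str:
        return "CRIT" if self.crit_open else "OK"


if __name__ == "__main__":
    mon = HeartbeatMonitor(last_seen=0.0)
    mon.on_tick(600.0)        # 10 minutes of silence: still inside the window
    print(mon.state())
    mon.on_tick(1000.0)       # more than 15 minutes of silence
    print(mon.state())
    mon.on_heartbeat(1200.0)  # heartbeats resume
    print(mon.state())
```

The last step is the one that does not happen in my setup: the service stays CRIT although a heartbeat has arrived again.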