An alert is a particular event - or aggregated group of events - that meets a threshold requiring investigation and action.
An alert is composed of events and represents only the latest state of an application, service, or infrastructure. The aggregated events that make up an alert can be viewed in a timeline that shows the corresponding state changes of the affected application, service, or infrastructure.
Monitoring tools generate events when potential problems are detected in your infrastructure. Over time, status updates and repeat events may occur from the same system issue. In BigPanda, raw event data is merged into a single alert so that you can visualize the lifecycle of a detected issue over time.
When an alert is resolved and then reopened during an incident, it belongs to the same entity as the original alert and appears on the same line within the Incident Timeline view. This makes it easy for you to spot recurring alerts.
For example, a CPU load alert may start with a warning event, then increase in severity with a critical event, and finally get resolved with a resolution event. All three of these events will be merged into a single alert. Common events that are sent to BigPanda include “CPU > 95% for more than 5 minutes” and “Port X on Router ABC down.”
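The merge described above can be sketched in a few lines of Python. This is only an illustration of the idea, not BigPanda's implementation: the Event and Alert classes, their field names, the merge_events function, and the choice of (host, check) as the key that identifies an alert are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    # Hypothetical event fields, chosen for this example only.
    host: str
    check: str
    status: str      # "warning", "critical", or "ok"
    timestamp: int   # seconds, arbitrary origin


@dataclass
class Alert:
    host: str
    check: str
    status: str = ""                               # latest state only
    timeline: list = field(default_factory=list)   # every state change seen

    def apply(self, event):
        # Record the event in the timeline and keep only the newest status.
        self.timeline.append((event.timestamp, event.status))
        self.status = event.status


def merge_events(events):
    """Merge raw events into one alert per (host, check) pair (assumed key)."""
    alerts = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        key = (event.host, event.check)
        alert = alerts.setdefault(key, Alert(event.host, event.check))
        alert.apply(event)
    return alerts


# The CPU load example: warning, then critical, then resolved.
events = [
    Event("web-1", "cpu_load", "warning", 100),
    Event("web-1", "cpu_load", "critical", 160),
    Event("web-1", "cpu_load", "ok", 400),
]
alerts = merge_events(events)
cpu_alert = alerts[("web-1", "cpu_load")]
print(len(alerts))         # 1
print(cpu_alert.status)    # ok
print(cpu_alert.timeline)  # [(100, 'warning'), (160, 'critical'), (400, 'ok')]
```

In this sketch a later event with the same key would be appended to the same alert, which mirrors how a reopened alert stays on the same line of the timeline.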
BigPanda correlates related alerts into incidents for visibility into high-level, actionable problems.
1 - Incident
2 - Alert
3 - Event
Some monitoring tools refer to ‘events’ as ‘alarms’ or ‘alerts.’ In BigPanda documentation, ‘alert’ is always used to refer to the complete lifecycle of an event.