With BigPanda Pragmatic AI, the logic behind machine learning suggestions is explained in plain English. Administrators and IT Ops team members can strengthen that logic themselves by adding situational and tribal knowledge, without relying on expert data scientists.
By recommending new correlation patterns based on new event streams and predicting probable root cause changes, BigPanda’s AI Engine helps IT Ops teams stay on top of real-time incident detection and reduce the time it takes to investigate and resolve an incident.
BigPanda AI doesn’t change your data itself. Instead, it suggests potential improvements and correlations, and lets you preview their impact on live production data before you deploy them.
- Pattern Recommendations
- Root Cause Change Prediction
As new datasets are ingested, BigPanda’s explainable AI technology suggests new correlation patterns on a regular basis. This helps maintain a high quality of correlation without the guesswork.
These correlation patterns are also editable, testable, and preview-able, giving IT Ops teams full control over the quality of correlation.
At its core, BigPanda's Algorithmic Correlation relies on pattern recognition. A pre-configured list of patterns is matched against alerts to identify alert clusters in real time. To classify alerts into incidents, BigPanda looks at information across several dimensions, including:
- Topology (e.g., datacenter, rack, cluster)
- Context (e.g., criticality, team, customer impact)
- Alert types (e.g., network, storage, application)
Each correlation pattern defines general properties to correlate: source, timespan, common alert attributes, and a filter.
Common patterns include:
| Pattern | Description |
|---|---|
| Connectivity alerts | Alerts triggered by devices attached to a single network in a 15-minute timespan |
| Load-related alerts | Alerts triggered by multiple servers supporting a single database in a 2-hour timespan |
| Common application alerts | Alerts triggered by tools like Splunk and AppDynamics in a 30-minute timespan |
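As a rough illustration of how the four pattern properties (source, timespan, common alert attributes, and filter) could work together, the sketch below groups alerts into clusters. It is not BigPanda's implementation: the `CorrelationPattern` class, the `correlate` function, the alert dictionary shape, and the rule that a window opens at a cluster's first alert are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable

# Hypothetical alert shape: a dict with "source", "time" (epoch seconds),
# and any number of tag keys such as "network" or "type".


@dataclass
class CorrelationPattern:
    sources: set            # source systems the pattern applies to; empty = any
    timespan_s: float       # correlation window, in seconds
    common_tags: tuple      # tags whose values must be identical within a cluster
    alert_filter: Callable[[dict], bool] = lambda alert: True


def correlate(alerts, pattern):
    """Group alerts into clusters according to a single pattern (illustrative only)."""
    buckets = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a["time"]):
        if pattern.sources and alert["source"] not in pattern.sources:
            continue
        if not pattern.alert_filter(alert):
            continue
        if any(tag not in alert for tag in pattern.common_tags):
            continue
        # Alerts that share the same values for the common tags land in one bucket.
        key = tuple(alert[tag] for tag in pattern.common_tags)
        buckets[key].append(alert)

    clusters = []
    for group in buckets.values():
        current = [group[0]]
        for alert in group[1:]:
            # Assumed rule: the window is measured from the cluster's first alert.
            if alert["time"] - current[0]["time"] > pattern.timespan_s:
                clusters.append(current)
                current = [alert]
            else:
                current.append(alert)
        clusters.append(current)
    return clusters


# A pattern in the spirit of "Connectivity alerts": same network, 15-minute window.
connectivity = CorrelationPattern(
    sources=set(),
    timespan_s=15 * 60,
    common_tags=("network",),
    alert_filter=lambda a: a.get("type") == "connectivity",
)
```

Calling `correlate(alerts, connectivity)` on a list of alert dictionaries returns a list of clusters, each a list of alerts that share a network and fall inside one 15-minute window.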
To learn more about how BigPanda merges events into alerts and clusters alerts into incidents, see the Alert Correlation Logic documentation.
In general, correlation patterns are managed by Administrators and the BigPanda Customer Success team. The Machine Learning Engine is able to supplement the human-designed patterns by autonomously searching for new correlations and suggesting new patterns.
BigPanda's Machine Learning Engine automatically generates correlation pattern suggestions based on historical user data. The review process begins as soon as a monitoring tool is integrated, and the first automatically generated pattern appears on the Correlation Patterns settings page within a few days.
How quickly the first pattern is generated depends on the richness and size of the available data. Over time, as more data flows through the system, additional patterns are recommended at an increased and variable rate.
Once the Machine Learning Engine suggests a pattern, administrators can decide to activate it, reject it, or customize it within the editor. The Real-Time Preview in the patterns editor gives you instant visibility on the impact a suggested pattern would have on correlated alerts in your system.
Suggested correlation patterns are made to stand on their own, but new patterns can also be modified to complement an existing set of in-use patterns.
The end result is better correlation reach with higher quality incidents to help your team resolve issues faster.
BigPanda's Machine Learning Engine is unsupervised in function and does not require training. It works by clustering the alert stream into high-quality Incidents. It will run autonomously in the background as soon as relevant data is present. Unlike supervised machine learning, human interaction and consistent input are not required for its upkeep and efficacy.
This unsupervised approach keeps incident formation transparent and consistent, but it never enacts changes on its own. Generated patterns are offered only as suggestions, so administrators keep full discretion and control over changes made in a Production environment.
To learn more about defining and managing correlation patterns, see our Manage Alert Correlation guide.
Once BigPanda is integrated with all your change feeds and tools, it aggregates and normalizes your change data (both new changes and updates to existing changes).
BigPanda's explainable AI technology then analyzes these changes against your existing incidents in real time, to identify and surface the probable root cause change right alongside the incident.
BigPanda’s Root Cause Changes feature collects change information through inbound change integrations and correlates changes to potentially related incidents.
Changes that are correlated strongly enough are floated up onto the Incident Overview section of the incident details as suggested related changes.
BigPanda’s Root Cause Changes feature leverages a text similarity algorithm.
The algorithm runs calculations on key connections between incidents and changes, including:
- Categories - tag key/value matches
- Time Factor - how close in time the change and the incident were
- Alerts Coverage - how many of the alerts match properties in the change
BigPanda is configured to suggest up to 2 related changes, but only changes that are highly correlated will be suggested.
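The selection rule described above (at most two suggestions, and only for highly correlated changes) can be sketched as a filter followed by a top-N cut. The function name `suggest_changes`, the score scale of 0 to 1, and the `threshold` value of 0.7 are hypothetical and chosen only for illustration. The source does not state what cutoff BigPanda uses.

```python
def suggest_changes(scored_changes, threshold=0.7, max_suggestions=2):
    """Return up to `max_suggestions` change IDs whose score clears `threshold`.

    `scored_changes` maps a change ID to a correlation score between 0 and 1.
    Both the threshold and the score scale are assumptions for this sketch.
    """
    strong = [(cid, score) for cid, score in scored_changes.items() if score >= threshold]
    # Highest-scoring changes first.
    strong.sort(key=lambda item: item[1], reverse=True)
    return [cid for cid, _ in strong[:max_suggestions]]


scores = {"CHG-1": 0.9, "CHG-2": 0.4, "CHG-3": 0.75, "CHG-4": 0.8}
print(suggest_changes(scores))  # ['CHG-1', 'CHG-4']
```

With this shape, an incident with no strongly correlated change gets an empty list, not a weak guess.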
Administrators can manage the algorithm to improve the quality and frequency of suggested root cause changes.
The Root Cause Analysis Algorithm is focused on finding causation, not correlation. The causation time factor is configured to consider only changes that could have affected a system long enough to create a system event.
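One way to read the causation time factor is as an eligibility check: a change counts as a candidate only if it started early enough before the incident to have plausibly caused it. The sketch below follows that reading. The function name, the `min_lead` and `max_lookback` parameters, and their default values are assumptions for this example, not documented BigPanda settings.

```python
from datetime import datetime, timedelta


def is_time_eligible(
    change_start,
    incident_start,
    min_lead=timedelta(minutes=5),     # hypothetical: minimum time a change needs to take effect
    max_lookback=timedelta(hours=24),  # hypothetical: ignore changes older than this
):
    """Return True if the change precedes the incident by a plausible causal interval."""
    lead = incident_start - change_start
    return min_lead <= lead <= max_lookback


incident = datetime(2024, 1, 1, 12, 0)
print(is_time_eligible(datetime(2024, 1, 1, 11, 30), incident))  # True
print(is_time_eligible(datetime(2024, 1, 1, 12, 10), incident))  # False: change came after the incident
```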
BigPanda's algorithms automatically detect connections between changes made to the system and incidents.
As new incidents are created or new alerts join an existing incident, BigPanda calculates their match potential with each past change.
If a change with high match potential is found, BigPanda marks the change as Suspect and adds a comment to the info popup explaining why the change was marked. Suspected changes will appear on the Overview tab as well as at the top of the Changes table. Filter the table to show only suspected and matched changes by clicking the Show potential RCC only toggle.
Suggested changes can rapidly speed up the root cause investigation process by identifying potential problems right at incident detection.
BigPanda will only mark changes as Suspect (not Match), giving users the final say on whether the change is the root cause of the incident.
The text similarity algorithm runs a calculation between incidents and changes to identify changes that may be related. It uses alert tags, details, and incident metadata to find common values between incidents and changes, automating a comparison that would otherwise take many hours of manual work.
Once matches are found, the algorithm weighs the match to see if it is potentially a sign of root cause. To do so, the algorithm breaks incident data out into categories, or types of connection. Each category has a particular weight assigned to it, reflecting the type of relationship between changes and incidents, as not all text-matches point to shared system attributes or resources.
For example, the “IP” category of tags, which holds data on IP addresses, is generally assigned a high weight, because changes and incidents occurring at the same IP address are likely to be connected.
The algorithm then calculates a causation score based on the time frames of the incident and change, and the strength and number of matches between the two. Changes with a high causation score are surfaced as suggested related changes in the Incident Details pane.
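To make the scoring idea concrete, here is a minimal sketch that combines weighted category matches, alert coverage, and a time factor into a single score. Everything specific in it is assumed for illustration: the category names and weights in `CATEGORY_WEIGHTS`, the exponential decay with a 60-minute half-life, and the plain average used to combine the three parts. The source does not describe BigPanda's actual formula.

```python
# Hypothetical category weights; higher means a match is stronger evidence.
CATEGORY_WEIGHTS = {"ip": 1.0, "host": 0.8, "application": 0.5, "description": 0.2}


def category_score(incident_tags, change_tags, weights=CATEGORY_WEIGHTS):
    """Weighted share of categories whose values match between incident and change."""
    total = sum(weights.values())
    matched = sum(
        weight
        for category, weight in weights.items()
        if category in incident_tags
        and category in change_tags
        and incident_tags[category] == change_tags[category]
    )
    return matched / total if total else 0.0


def alerts_coverage(alerts, change_tags):
    """Fraction of the incident's alerts that share at least one tag value with the change."""
    if not alerts:
        return 0.0
    hits = sum(
        1
        for alert in alerts
        if any(key in change_tags and change_tags[key] == value for key, value in alert.items())
    )
    return hits / len(alerts)


def time_factor(minutes_between, half_life_min=60.0):
    """Decay from 1.0 toward 0 as the gap between change and incident grows (assumed shape)."""
    return 0.5 ** (max(minutes_between, 0.0) / half_life_min)


def causation_score(incident_tags, alerts, change_tags, minutes_between):
    """Average of the three components; the equal weighting is an assumption."""
    return (
        category_score(incident_tags, change_tags)
        + alerts_coverage(alerts, change_tags)
        + time_factor(minutes_between)
    ) / 3
```

For instance, an incident and a change that share an IP address but not a host score higher on `category_score` than a pair that only share an application name, which mirrors the weighting rationale given for the “IP” category.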
Pragmatic AI helps you refine your incident correlation and root cause investigation, while leaving control in your hands.
Paired with the operational and system insights of BigPanda Analytics, Pragmatic AI can further improve your IT Operations. Use Analytics to dig deeper into suggested correlation patterns for specific tools or resources, or to visualize change data patterns.
Read more about BigPanda analytics and reporting options in the BigPanda Analytics documentation.