Root Cause Changes (RCC) dramatically speeds up the process of identifying the changes that cause outages and incidents in your environments.
When you integrate your CI/CD and change management tools with BigPanda, you can normalize and aggregate change data alongside incidents. This comprehensive enrichment gives you deep insights into the changes that may have triggered an issue.
BigPanda analyzes each change against active incidents in real time, so your teams don’t have to manually dig through hundreds or thousands of potentially related change events. Changes suspected as a potential root cause are flagged and added to the incident details.
RCC uses an algorithm based on natural language processing and vector space models to compare changes to active incidents. This allows BigPanda to identify the connections between alerts and change data with confidence that suspected changes are statistically relevant.
- Connect a variety of change tools to BigPanda using standard and custom integrations.
- View changes that occurred prior to and during an incident to easily identify changes that may have been related to the incident.
- Visualize metrics related to change data using dashboards in Unified Analytics.
Changes (RCC) API
Root Cause Changes can also be viewed and managed with the Changes (RCC) API.
BigPanda includes several standard integrations ready to connect your change feeds. You can also build custom integrations with the Root Cause Changes (RCC) REST API. These integrations collect and normalize data from your various tools and bring it together into BigPanda’s single pane of glass.
Change integrations give your Operations teams deeper insights into the system changes that may be triggering system events and outages. This gives Operations teams clear visibility into changes pushed by Developers, empowering the two teams to collaborate more proactively and effectively.
Learn more about integrating your change tools with BigPanda in the Integrate with BigPanda documentation.
Changes that occurred shortly before or during the incident are displayed in the Incident Details pane within the Changes tab. Here operators can see vital information about the change, including status, summary, and start time.
Meanwhile, BigPanda will automatically compare change data to incoming incidents, looking for potential incident causes. If a change is highly correlated with an incident, it will appear as a Suspect in the Incident Details pane. RCC suspects are identified with details about why BigPanda thinks the change may be related to the ongoing incident. While BigPanda is configured to suggest up to 5 related changes, only changes that are highly correlated will be suggested.
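The suggestion rule described above (at most 5 suspects, and only highly correlated ones) can be pictured with a small sketch. This is an illustration only, not BigPanda's implementation: the `ScoredChange` type, the 0 to 1 score scale, and the `threshold` value are all assumptions made for the example.

```python
from typing import NamedTuple


class ScoredChange(NamedTuple):
    change_id: str
    score: float  # hypothetical causation score on a 0..1 scale


def select_suspects(scored, threshold=0.7, limit=5):
    """Return at most `limit` changes whose score meets `threshold`, best first.

    Both `threshold` and the score scale are assumed values for illustration.
    """
    strong = [c for c in scored if c.score >= threshold]
    return sorted(strong, key=lambda c: c.score, reverse=True)[:limit]


if __name__ == "__main__":
    candidates = [
        ScoredChange("CHG-1", 0.90),
        ScoredChange("CHG-2", 0.40),
        ScoredChange("CHG-3", 0.75),
    ]
    for suspect in select_suspects(candidates):
        print(suspect.change_id, suspect.score)
```

Under these assumptions, a weakly correlated change such as `CHG-2` is never suggested, even when fewer than 5 suspects are available.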
Operators can then search the table, dig into change details, and mark whether a change should be matched to the incident. By marking the results of change investigation in the console, teams can collaborate to identify the real cause.
Use the Show potential RCC only toggle to limit the change table to only show changes BigPanda has identified as RCC suspects.
Learn more about how operators can leverage changes in the BigPanda console in the Remediate Incidents documentation.
Root Cause Changes (RCC) leverages an algorithm based on natural language processing and vector space models. BigPanda intuitively compares the complex and discordant data from monitoring and change tools, while considering the context and timing of causal relationships.
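To make the idea of a vector space model concrete, here is a minimal sketch that represents a change summary and an alert description as word-count vectors and compares them with cosine similarity. It is a generic textbook technique, not the RCC algorithm itself; the tokenizer and the function names are assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the bag-of-words vectors of two texts."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    # Dot product over the tokens the two texts share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    change = "Deploy checkout-service v2.3 to prod"
    alert = "High error rate on checkout-service in prod"
    print(round(cosine_similarity(change, alert), 3))
```

A score near 1 means the two texts use largely the same terms; a score of 0 means they share none.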
RCC runs calculations on key connections between incidents and changes, including:
- Time Frame - how close in time the change and the incident occurred
- Alerts Coverage - how many of the alerts match properties in the change
- Categories - groups of specific details defined for weighting and parsing matches
Each incident-change match is given a causation score based on these calculations, with a higher score indicating a more likely suspect. Changes with a high causation score are surfaced in the Incident Details pane as RCC Suspects.
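One simple way to picture how several calculations could feed a single causation score is a weighted average. The sketch below assumes each of the three signals has already been reduced to a value between 0 and 1; the weights and the function name are hypothetical, and the actual RCC scoring formula is not described in this document.

```python
def causation_score(time_score, coverage, category, weights=(0.3, 0.3, 0.4)):
    """Combine three sub-scores in [0, 1] into one weighted average.

    The weights are illustrative assumptions, not BigPanda's values.
    """
    parts = (time_score, coverage, category)
    if any(not 0.0 <= p <= 1.0 for p in parts):
        raise ValueError("each sub-score must be between 0 and 1")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(p * w for p, w in zip(parts, weights)) / total_weight


if __name__ == "__main__":
    # A recent change that matches most alerts and the key categories.
    print(causation_score(time_score=0.9, coverage=0.8, category=1.0))
    # A change that only matches on timing.
    print(causation_score(time_score=0.9, coverage=0.0, category=0.0))
```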
RCC is focused on finding causation, not correlation. Only changes that have been implemented long enough to create a system event, and scheduled change windows that have recently started, are considered as potential causes.
Changes that happened too far before the incident are also excluded, because incidents usually happen shortly after system changes.
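The time frame rule can be sketched as a window check: a change qualifies only if it started at least some minimum time before the incident and no more than some maximum time before it. The `min_lag` and `max_lookback` values below are assumptions chosen for the example, not documented BigPanda settings.

```python
from datetime import datetime, timedelta


def in_causal_window(
    change_start,
    incident_start,
    min_lag=timedelta(minutes=2),      # assumed: time needed to cause an event
    max_lookback=timedelta(hours=24),  # assumed: older changes are ignored
):
    """Return True if the change started within the assumed causal window."""
    gap = incident_start - change_start
    # A negative gap means the change started after the incident began.
    return min_lag <= gap <= max_lookback


if __name__ == "__main__":
    incident = datetime(2024, 1, 1, 12, 0)
    print(in_causal_window(datetime(2024, 1, 1, 11, 30), incident))  # 30 min before
    print(in_causal_window(datetime(2023, 12, 30, 12, 0), incident))  # 2 days before
```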
Many incidents will have at least one alert that matches the data for recent changes. When only a single alert matches, the relationship between the incident and change may not be causal, especially in complex incidents with multiple downstream impacts.
To help identify strong causal relationships, RCC considers the percentage of alerts in an incident that align with the change details. Higher percentages indicate a closer connection between the incident and change.
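As a rough illustration of alerts coverage, the sketch below computes the fraction of alerts in an incident that share at least one tag value with the change. The data shapes (alerts as tag dictionaries, the change as a set of values) and the case-insensitive exact match are assumptions; real matching rules are likely more nuanced.

```python
def alerts_coverage(alerts, change_values):
    """Fraction of alerts with at least one tag value found in the change.

    `alerts` is a list of {tag_name: value} dicts and `change_values` is a
    collection of strings taken from the change; both shapes are assumed.
    """
    if not alerts:
        return 0.0
    wanted = {str(v).lower() for v in change_values}
    matched = sum(
        1
        for alert in alerts
        if any(str(value).lower() in wanted for value in alert.values())
    )
    return matched / len(alerts)


if __name__ == "__main__":
    incident_alerts = [
        {"host": "web-01", "service": "checkout"},
        {"host": "db-02", "service": "orders"},
        {"host": "web-03", "service": "Checkout"},
        {"host": "cache-01", "service": "search"},
    ]
    print(alerts_coverage(incident_alerts, {"checkout"}))
```

In this example two of the four alerts mention the changed service, so the change covers half of the incident.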
RCC uses change details, alert tags, and incident metadata to find common values between incidents and changes.
However, not all matches imply a strong connection or potential cause.
To consider the context and relationships between data, RCC breaks incident data into a hierarchy of categories weighted by importance, based on expected incident and change alignment. Different weights and parsing rules apply to tag matches in each category, making sure that matches reflect the relationship of shared system attributes and resources.
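Category weighting can be pictured as follows: each category carries a weight, and a match in a heavily weighted category counts for more than a match in a lightly weighted one. The category names, the weights, and the exact-match rule in this sketch are hypothetical and do not reflect any real RCC configuration.

```python
# Hypothetical categories and weights, for illustration only.
CATEGORY_WEIGHTS = {"service": 0.5, "host": 0.3, "environment": 0.2}


def category_score(incident_tags, change_tags, weights=CATEGORY_WEIGHTS):
    """Weighted share of categories where incident and change values agree."""
    total = sum(weights.values())
    if total <= 0:
        return 0.0
    matched = sum(
        weight
        for category, weight in weights.items()
        if incident_tags.get(category) is not None
        and incident_tags.get(category) == change_tags.get(category)
    )
    return matched / total


if __name__ == "__main__":
    incident = {"service": "checkout", "host": "web-01", "environment": "prod"}
    same_service = {"service": "checkout", "environment": "prod"}
    same_env_only = {"service": "search", "environment": "prod"}
    print(category_score(incident, same_service))
    print(category_score(incident, same_env_only))
```

With these assumed weights, sharing only an environment scores far lower than sharing a service, which mirrors the point that not every match implies a strong connection.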
Your default RCC category configuration is built on common industry practices, system topology, and tags and processes unique to your organization.
RCC works best when it has rich data and meaningful relationships identified for your organization.
The more standardized information available in incoming tags and description fields, the easier it is for BigPanda to accurately spot causality. If you’d like to improve your RCC results, high-quality enrichment and tag normalization are an important start.
For even more refinement of results, you can request modifications to your RCC category and parsing configuration. This is a complex back-end process requiring close coordination with BigPanda support. Contact BigPanda support if you are interested in adjusting your RCC configuration.
Learn more about BigPanda's Incident Intelligence
Dig deeper into Correlating Changes with Incidents
Begin integrating Change Integrations