Manage the Root Cause Changes Algorithm

The text-similarity algorithm can be adjusted to better match the patterns within your system.

BigPanda’s Root Cause Changes feature collects change information through inbound change integrations and correlates changes to potentially related incidents.

Changes that correlate strongly enough are surfaced in the Incident Overview section of the incident details as suggested related changes.

BigPanda’s Root Cause Changes feature leverages a text-similarity algorithm.

The algorithm runs calculations on key connections between incidents and changes, including:

  • Categories - tag key/value matches
  • Time Factor - how close in time the change and the incident occurred
  • Alerts Coverage - how many of the alerts match properties in the change

BigPanda is configured to suggest up to 2 related changes, but only changes that are highly correlated will be suggested.

Administrators can manage the algorithm to improve the quality and frequency of suggested root cause changes.

👍

The Root Cause Analysis Algorithm is focused on finding causation, not correlation. The causation time factor is configured to consider only changes that could have affected a system long enough to create a system event.

🚧

When calculating time ranges, the algorithm rounds start times up and end times down to the nearest hour. If you search for changes manually based on the matches you expect, your results may differ from the algorithm's.
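
The rounding rule can be illustrated with a short Python sketch. It shows one reading of the rule (a start time is pushed forward to the next full hour, an end time is pulled back to the previous full hour), and the function names are hypothetical, not part of any BigPanda API.

```python
from datetime import datetime, timedelta


def round_up_to_hour(ts: datetime) -> datetime:
    """Round up to the next full hour; a time already on the hour is unchanged."""
    floored = ts.replace(minute=0, second=0, microsecond=0)
    return floored if floored == ts else floored + timedelta(hours=1)


def round_down_to_hour(ts: datetime) -> datetime:
    """Round down to the start of the current hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def rounded_window(start: datetime, end: datetime) -> tuple:
    """Assumed behavior: start rounds up, end rounds down."""
    return round_up_to_hour(start), round_down_to_hour(end)


# A window of 09:20 to 13:45 is evaluated as 10:00 to 13:00.
window = rounded_window(datetime(2024, 1, 15, 9, 20), datetime(2024, 1, 15, 13, 45))
print(window)
```

Under this reading the evaluated window is narrower than the raw one, which is why a manual search over the raw times can return changes the algorithm did not consider.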

Text-Similarity Algorithm

The text-similarity algorithm runs a calculation between incidents and changes to identify changes that may be related. It uses alert tags, details, and incident metadata to find common values between incidents and changes, automating a comparison that would otherwise take many hours of manual work.

Once matches are found, the algorithm weighs the match to see if it is potentially a sign of root cause. To do so, the algorithm breaks incident data out into categories, or types of connection. Each category has a particular weight assigned to it, reflecting the type of relationship between changes and incidents, as not all text-matches point to shared system attributes or resources.

For example, the “IP” category of tags, which holds data on IP addresses, is generally assigned a high weight, because changes and incidents occurring at the same IP address are likely to be connected.

The algorithm then calculates a causation score based on the time frames of the incident and change, and the strength and number of matches between the two. Changes with a high causation score are surfaced as suggested related changes in the Incident Details pane.
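
As a rough illustration of how category weights, the time factor, and alerts coverage could combine into one score, here is a minimal Python sketch. The weights are the defaults listed on this page, but the formula itself (sum of matched weights, multiplied by a linear time decay and by the share of matching alerts) is an assumption made for the example; the actual BigPanda calculation is not documented here. All names are hypothetical.

```python
from dataclasses import dataclass

# Default category weights as listed on this page.
CATEGORY_WEIGHTS = {
    "IP": 7,
    "Configuration Item": 5,
    "Host": 4,
    "Application": 4,
    "Team": 3,
    "Environment": 2,
}


@dataclass
class Match:
    category: str  # text-similarity category of the shared value
    value: str     # the value found in both the incident and the change


def time_factor(hours_apart: float, max_hours: float = 24.0) -> float:
    """Hypothetical linear decay: 1.0 when simultaneous, 0.0 at max_hours or later."""
    if hours_apart < 0 or hours_apart >= max_hours:
        return 0.0
    return 1.0 - hours_apart / max_hours


def causation_score(matches, hours_apart, alerts_matched, alerts_total) -> float:
    """Assumed formula: matched weights x time factor x alerts coverage."""
    category_score = sum(CATEGORY_WEIGHTS.get(m.category, 0) for m in matches)
    coverage = alerts_matched / alerts_total if alerts_total else 0.0
    return category_score * time_factor(hours_apart) * coverage


matches = [Match("IP", "10.0.4.17"), Match("Host", "web-01")]
# Weights 7 + 4 = 11, time factor 0.75, coverage 0.5
print(causation_score(matches, hours_apart=6, alerts_matched=2, alerts_total=4))  # 4.125
```

The point of the sketch is the shape of the calculation: a high-weight match such as an IP address moves the score far more than an environment match, and the same matches count for less as the change and incident drift apart in time.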

Sample Text-Similarity Suggestion

Configuring the Text-Similarity Algorithm

🚧

Adjusting the text-similarity algorithm is a complex back-end process requiring close coordination with BigPanda support. Reach out to us at [email protected] if you are interested in configuring your text-similarity algorithm.

The text-similarity algorithm can be adjusted to better reflect the relationship between changes and incidents. You can change categories, adjust the weight given to specific types of matches, or remove specific categories entirely.

You can test algorithm changes before activating them by using score reports.

👍

Rich alert and change data is vital to the success of the text-similarity algorithm. The more information available in incoming tags and description fields, the easier it is for the algorithm to spot matches.

Default Text-Similarity Settings

Main Default Categories

| Category Name | tag_keys | tag_values | Default Weight |
|---|---|---|---|
| Configuration Item | "cmdb_ci", "yp_service_id", "ci", "configuration_item" | | 5 |
| AWS Region | | ["ue1", "uw1", "uw2", "ew1", "ec1", "an1", "an2", "as1", "as2", "se1","us-east-2","us-east-1","us-west-1","us-west-2","ap-east-1",...] | 1 |
| Environment | "environment","environments","env","envs","tier","tiers","stage" | | 2 |
| Team | "responsible_group", "teams", "owners", "assignment_group", "groups", "team", "owner", "group" | | 3 |
| Application | "application", "services", "applications", "service", "business_service", "app", "apps" | | 4 |
| Host | "hostnames", "server", "hostname", "host", "instances", "host_name", "instance", "servers", "object", "hosts", "nodes", "host_names", "node", "objects", "device", "devices" | | 4 |

Additional Default Categories

| Category | Regex | tag_keys | Default Weight |
|---|---|---|---|
| IP | (?:[0-9]{1,3}.){3}[0-9]{1,3} | | 7 |
| Email | (\[email protected]\S+) | | 2 |
| URL | ^(https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_!:,.;]*[-a-zA-Z0-9+&@#/%=~_] | | 4 |
| MAC | (([0-9A-F]{2}[:-]){5}([0-9A-F]{2})) | | 5 |
| Port | number between 1 and 65535 | must contain ‘port’ | 4 |
| PrimaryKey | | | 3 |
| SecondaryKey | | | 2 |
| SourceSystem | | | 3 |
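
To make the pattern-based categories more concrete, the following Python sketch classifies a single tag using the IP, MAC, and Port rules listed above. It is illustrative only: the `classify` function and the data layout are hypothetical, the dot in the IP pattern is escaped here on the assumption that it is meant to match a literal dot, and Email and URL are omitted for brevity.

```python
import re

# (compiled pattern, default weight) per category, based on the listing above.
PATTERN_CATEGORIES = {
    # Assumption: the dot is escaped so that it matches a literal "." only.
    "IP": (re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}"), 7),
    "MAC": (re.compile(r"(([0-9A-F]{2}[:-]){5}([0-9A-F]{2}))"), 5),
}
PORT_WEIGHT = 4


def classify(tag_key: str, tag_value: str) -> list:
    """Return (category, matched_text, weight) tuples found in one tag."""
    found = []
    for name, (pattern, weight) in PATTERN_CATEGORIES.items():
        for m in pattern.finditer(tag_value):
            found.append((name, m.group(0), weight))
    # Port rule: the tag key must contain "port" and the value must be 1..65535.
    is_number = tag_value.isascii() and tag_value.isdigit()
    if "port" in tag_key.lower() and is_number and 1 <= int(tag_value) <= 65535:
        found.append(("Port", tag_value, PORT_WEIGHT))
    return found


print(classify("description", "Restarted nginx on 10.0.4.17"))
print(classify("listen_port", "8443"))
print(classify("mac", "00:1A:2B:3C:4D:5E"))
```

Note that the MAC pattern as listed matches uppercase hexadecimal digits only, so a lowercase address would not be picked up unless the data is normalized first.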

Other RCC Input

| Type | Description | Adjustment Examples | Default Setting |
|---|---|---|---|
| Change Timeframe | Maximum possible difference between alert time and change end time | For an org that follows business hours, increasing to 3 days to capture weekends | 1 Day |
| Change Fields | Which change fields are included in the calculation | Deselect build-notes that might muddle the actual change impact | All Tags |
| Change Limit | Maximum number of changes that can be Suspect for one incident | Decrease limit if too many faulty suggestions are slipping through | 2 |
| Excluded Change Statuses | Exclude all changes with a specific status | Excluding In Development, Unscheduled, or other non-actionable statuses | Canceled |
| Minimum Score Threshold | Lowest possible score that can appear as a root cause subject | Use this score and custom weights to lock categories together (e.g. a business unit category and the minimum score are both 30, so only changes that match the business unit will be suggested) | 0 |
| Splitters | Delimiters to split text (should prod-correlation-12 be treated as 1 whole or 3 separate) | Remove the hyphen to treat “prod-correlation-12” as a single option and narrow results | [ " ", "\n", "[", "]", "(", ")", """, "'", "*", ",", "::" ] |
| Stop Words | Words to exclude from correlation | Add words that are used commonly across an org to keep them from muddying results | A list of 900 words |
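
The settings above can be pictured as one configuration object that drives three steps: deciding whether a change is eligible, splitting text into comparable tokens, and capping the suggestions. The Python sketch below is a hypothetical model built from the listed defaults, not BigPanda's implementation. It assumes the time difference is absolute, reads the `"""` entry in the splitter list as a double quote, and uses an empty stop-word set in place of the real list.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RccSettings:
    """Hypothetical container mirroring the listed default settings."""
    change_timeframe: timedelta = timedelta(days=1)
    change_limit: int = 2
    excluded_statuses: frozenset = frozenset({"Canceled"})
    min_score: float = 0.0
    splitters: tuple = (" ", "\n", "[", "]", "(", ")", '"', "'", "*", ",", "::")
    stop_words: frozenset = frozenset()  # placeholder for the real word list


def is_eligible(change_end, change_status, alert_time, settings) -> bool:
    """Drop excluded statuses and changes outside the timeframe (absolute gap assumed)."""
    if change_status in settings.excluded_statuses:
        return False
    return abs(alert_time - change_end) <= settings.change_timeframe


def tokenize(text: str, settings: RccSettings) -> list:
    """Split text on the configured splitters, then drop empty tokens and stop words."""
    tokens = [text]
    if settings.splitters:
        # Longer splitters first so "::" is tried before single characters.
        ordered = sorted(settings.splitters, key=len, reverse=True)
        tokens = re.split("|".join(re.escape(s) for s in ordered), text)
    return [t for t in tokens if t and t.lower() not in settings.stop_words]


def select_suggestions(scored, settings) -> list:
    """Keep (change_id, score) pairs at or above the threshold, best first, up to the limit."""
    kept = sorted((p for p in scored if p[1] >= settings.min_score),
                  key=lambda p: p[1], reverse=True)
    return kept[: settings.change_limit]


defaults = RccSettings()
with_hyphen = RccSettings(splitters=defaults.splitters + ("-",))
text = "deploy prod-correlation-12 (api::gateway)"
print(tokenize(text, defaults))     # hyphenated name stays whole
print(tokenize(text, with_hyphen))  # hyphenated name splits into three tokens
```

Treating the hyphen as a splitter produces more, shorter tokens and therefore more potential matches, while leaving it out keeps “prod-correlation-12” as one token and narrows the results.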

Reporting on Root Cause Changes

If your organization has already enabled root cause changes, the BigPanda team is able to pull a change correlation report on past algorithm suggestions.

The report includes information on:

  • Incident time
  • Incident id
  • Change id
  • Entity id
  • What value triggered the match
  • Match type (text-similarity category)
  • Incident tag key
  • Change tag key
  • Time and total correlation score
  • Whether or not the suggested match was manually marked as Suspect, Match, or None

Contact us at [email protected] if you would like to access the change correlation report.

Next Steps

Learn more about BigPanda's Incident Intelligence

Dig deeper into Correlating Changes with Incidents

Begin integrating Change Integrations