Unified Analytics Key Metrics

Unified Analytics leverages several key metrics to give you insight into your incident management workflows.

Unified Analytics uses standard reporting definitions, calculated fields, and dashboards to provide insight into your incident management workflows. Use the definitions below to gain an understanding of the metrics provided by the data in Unified Analytics.

🚧

In-Progress Rollout

Your organization may not be using this feature yet as it is rolling out to our BigPanda customers through a staged migration. To find out more about the migration requirements and when your organization can move to the new functionality, reach out to us at [email protected].

If your organization is still using the previous analytics, you can learn more about creating reports in the BigPanda Reporting documentation.

Business Reporting Definitions

Unified Analytics uses standard fields and terminology to describe elements and stages of the incident management process.

BigPanda’s specific definition of these terms is based on industry standards and best practices to help you measure and track business and operational metrics.

Term

Description

Business Segment

A business portion or unit that drives unique action or functionality. For example, a high-level Business Unit, Team, Group, or Organization.

Business Category

The type of alert. For example, Application, Infrastructure, or Service.

Config Item

An aggregated field used to capture the relevant configuration item. The default is host and service, but a custom field can also be set.

Actionable Incident

An incident that contains high quality alerts enriched with both technical and business context.

Unified Analytics uses the following criteria to determine if an incident is actionable:

  • Explicitly defined as actionable using the bp_v_actionable tag
  • Enriched with business context (using the bp_v_business_segment tag)
  • Incident was acted upon
  • Not defined as noise (using the bp_v_alert_noise tag)

Incident Outliers

When calculating MTTx metrics, a small number of incidents with very large times can significantly skew the total numbers. This field is used to filter out these incidents and normalize the MTTx measurement.

By default, Incident Outliers are excluded from specific calculations to provide more accurate results. You can use the BP Incident Outlier field to turn this on or off.

Mean Time to X (MTTx)

Mean Time to X (MTTx) measures the average time it takes to perform an action within BigPanda. MTTx metrics calculate the performance of your Incident Management team.

Mean Time to Assign (MTTA) - The average amount of time it takes the IT Ops team to assign the incident. In BigPanda, MTTA is calculated based on when the assign action is used.

Mean Time to Engage (MTTE) - The average amount of time it takes the IT Ops team to engage in resolving the incident. In BigPanda, this is measured by the time it takes to perform an action other than assign.

Mean Time to Fix (MTTF) - The average amount of time between engagement and resolution. In BigPanda, MTTF is automatically calculated from the start to the end of the incident.

Mean Time to Resolve (MTTR) - The average amount of time it took to get back to service. MTTR looks at the repair of alert symptoms as opposed to the complete resolution of the incident. In BigPanda, it is calculated from the start of the first alert, to the resolution of the last alert.
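
As a rough illustration of the MTTx definitions above, the following sketch averages elapsed minutes between incident timestamps. The record layout (start, assigned, engaged, resolved) is hypothetical and is not the Unified Analytics data model. Measuring MTTA and MTTE from the incident start is also an assumption.

```python
from datetime import datetime
from statistics import mean


def mean_minutes(incidents, start_key, end_key):
    """Average elapsed minutes between two timestamps.

    Incidents missing either timestamp are skipped.
    """
    durations = [
        (inc[end_key] - inc[start_key]).total_seconds() / 60
        for inc in incidents
        if inc.get(start_key) is not None and inc.get(end_key) is not None
    ]
    return mean(durations) if durations else None


# Hypothetical incident records; the field names are assumptions.
incidents = [
    {"start": datetime(2024, 1, 1, 9, 0), "assigned": datetime(2024, 1, 1, 9, 10),
     "engaged": datetime(2024, 1, 1, 9, 20), "resolved": datetime(2024, 1, 1, 10, 0)},
    {"start": datetime(2024, 1, 2, 14, 0), "assigned": datetime(2024, 1, 2, 14, 30),
     "engaged": datetime(2024, 1, 2, 14, 40), "resolved": datetime(2024, 1, 2, 16, 0)},
]

mtta = mean_minutes(incidents, "start", "assigned")  # 20.0 minutes
mtte = mean_minutes(incidents, "start", "engaged")   # 30.0 minutes
mttr = mean_minutes(incidents, "start", "resolved")  # 90.0 minutes
```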

Mean Time Between Failures (MTBF)

The average amount of time between failures. MTBF measures issue recurrence, or the time between when an incident is resolved and when/how often it reoccurs.

MTTR Volatility

Measures the consistency in the time it takes to resolve incidents. MTTR Volatility is measured based on the ratio between average MTTR and the median.
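
A minimal sketch of this ratio, assuming the mean is divided by the median (the direction of the ratio is an assumption) and that resolution times are supplied in minutes:

```python
from statistics import mean, median


def mttr_volatility(resolve_minutes):
    """Ratio of mean to median resolution time.

    Values near 1 suggest consistent resolution times. Larger values
    suggest that a few long incidents are pulling the average up.
    """
    if not resolve_minutes:
        return None
    mid = median(resolve_minutes)
    if mid == 0:
        return None
    return mean(resolve_minutes) / mid


consistent = mttr_volatility([10, 10, 10])   # 1.0
skewed = mttr_volatility([10, 20, 120])      # 2.5
```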

Compression Ratio

The percent of alerts that were correlated and deduplicated into incidents.

Enrichment Hit Rate

The percent of alerts that were enriched in BigPanda.

BigPanda Workload

The number of resolved incidents multiplied by MTTR.

Measures the impact of BigPanda on the overall team efficiency.

Quarter over Quarter (QoQ)

Quarter over Quarter (QoQ) is the change from the previous quarter to the last quarter in the timeframe.

If a timeframe has multiple quarters, the comparison will be from the last quarter in the timeframe to the previous one.

If a timeframe has only one quarter, the comparison will be between that quarter and the one before it.
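
The BigPanda Workload and QoQ definitions above can be sketched together as follows. Treating the QoQ change as a percentage of the previous quarter is an assumption, and the function names and sample values are hypothetical.

```python
def workload(resolved_incidents, mttr_minutes):
    """BigPanda Workload: resolved incidents multiplied by MTTR."""
    return resolved_incidents * mttr_minutes


def qoq_change(quarterly_values):
    """Percent change from the second-to-last to the last quarter.

    quarterly_values is ordered oldest to newest. Returns None when there
    are fewer than two quarters or the previous quarter is zero.
    """
    if len(quarterly_values) < 2:
        return None
    previous, last = quarterly_values[-2], quarterly_values[-1]
    if previous == 0:
        return None
    return (last - previous) * 100 / previous


# Three quarters in the timeframe; only the last two are compared.
quarterly_workload = [workload(500, 45), workload(400, 50), workload(300, 40)]
change = qoq_change(quarterly_workload)  # -40.0 (workload fell by 40%)
```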

More BigPanda definitions can be found in the BigPanda Glossary.

To learn more about how to best understand and leverage these terms during reporting, check out the Unified Analytics Terminology guide in the BigPanda University Unified Analytics Course.

Calculated Fields

The following dashboard fields are calculated within the cached data model:

Term

Dashboard Field

Calculation

Incident Closure

BP Incident Closure

Based on actions, incidents are categorized as:

Shared - The incident was manually shared

Auto Shared - The incident was AutoShared

Missed - The incident resolved itself

No Action - The incident was resolved manually with no other action

Resolved in BP - The incident was resolved within BigPanda

Still Open - The incident is still open

Incident Outliers

BP Incident Outlier

Incidents are set as an outlier if:

Assign time is over a day
Engage time is over a day
Resolution time is over a week

If an incident isn’t an outlier, the setting is Valid.
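
The thresholds above can be sketched as a small classifier. This assumes an incident is an outlier when any one threshold is exceeded. The "Outlier" label is also an assumption, since only the Valid setting is named here.

```python
from datetime import timedelta

DAY = timedelta(days=1)
WEEK = timedelta(weeks=1)


def outlier_status(assign_time=None, engage_time=None, resolution_time=None):
    """Return "Outlier" if any duration exceeds its threshold, else "Valid".

    Durations are timedelta values; None means the action has not happened.
    """
    checks = [
        (assign_time, DAY),
        (engage_time, DAY),
        (resolution_time, WEEK),
    ]
    is_outlier = any(
        value is not None and value > limit for value, limit in checks
    )
    return "Outlier" if is_outlier else "Valid"


status = outlier_status(
    assign_time=timedelta(hours=2),
    resolution_time=timedelta(days=3),
)  # "Valid"
```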

Resolution Bucket

BP Incident Resolution Bucket

Resolution buckets include the following:

Still Open
Under 5 min
5 - 30 min
30 - 60 min
1 - 4 hours
4 - 24 hours
1 - 7 days
Over a week
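
A short sketch of how a resolution time could be mapped to the buckets listed above. How boundary values are handled (for example, exactly 5 minutes) is not stated here, so the half-open ranges below are an assumption.

```python
from datetime import timedelta

# (exclusive upper bound, label) pairs in ascending order.
BUCKETS = [
    (timedelta(minutes=5), "Under 5 min"),
    (timedelta(minutes=30), "5 - 30 min"),
    (timedelta(minutes=60), "30 - 60 min"),
    (timedelta(hours=4), "1 - 4 hours"),
    (timedelta(hours=24), "4 - 24 hours"),
    (timedelta(days=7), "1 - 7 days"),
]


def resolution_bucket(duration):
    """Map a resolution duration to its bucket label.

    None means the incident has not been resolved yet.
    """
    if duration is None:
        return "Still Open"
    for upper_bound, label in BUCKETS:
        if duration < upper_bound:
            return label
    return "Over a week"


bucket = resolution_bucket(timedelta(minutes=45))  # "30 - 60 min"
```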

Alert Quality

BigPanda categorizes alert quality based on the amount of enrichment or context available.

The following alert quality categories are available:

Low Quality Alerts - Alerts that lack key information or which were identified as irrelevant or misconfigured. Low quality alerts do not meet the logic criteria for Medium or High Quality Alerts.

Medium Quality Alerts - An alert that contains the minimum level of information and technical context to support operator action. These alerts lack some valuable elements such as business context, dependencies, or resolution steps.

📘

Medium Quality Alert Logic

The logic for defining medium quality alerts is CI AND Check.

High Quality Alerts - The most actionable type of alerts with all important data needed for a team to triage and resolve the alert. High quality alerts usually include data on both technical and business context.

📘

High Quality Alert Logic

High Quality Alerts must meet the same criteria as Medium Quality Alerts. The additional logic for defining high quality alerts is Ownership & Routing AND Business Impact AND (Runbooks OR Dependency OR Enriched).

The following categories and fields define medium alert quality:

Category

Definition

Default Fields

CI

Configuration item.

normalized_host
bp_application
bp_service
service
bp_v_config_item

Check

Symptom of the problem.

normalized_check

The following categories and fields define high alert quality:

Category

Definition

Default Fields

Ownership & Routing

The relevant part of the business and/or team that should handle it.

bp_assignment_group
assignment group
bp_group
bp_v_business_segment

Business Impact

The impact of the alert on the business.
Can be priority level, application tiers, etc.

bp_priority
priority
bp_impact
bp_urgency

Runbooks

Information about how the alert should be handled.

bp_runbook
kb
runbook_url

Dependency

Understanding of the services or application it impacts.

impacted_service
bp_application
application
bp_service
service

Enriched

Whether the alert was enriched.

enrichment
enriched
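
The medium and high quality logic can be sketched as below, using the default fields from the tables above. Treating a category as met when any of its default fields has a non-empty value is an assumption, and the function names are hypothetical.

```python
# Default fields per category, copied from the tables above.
CATEGORY_FIELDS = {
    "CI": ["normalized_host", "bp_application", "bp_service", "service",
           "bp_v_config_item"],
    "Check": ["normalized_check"],
    "Ownership & Routing": ["bp_assignment_group", "assignment group",
                            "bp_group", "bp_v_business_segment"],
    "Business Impact": ["bp_priority", "priority", "bp_impact", "bp_urgency"],
    "Runbooks": ["bp_runbook", "kb", "runbook_url"],
    "Dependency": ["impacted_service", "bp_application", "application",
                   "bp_service", "service"],
    "Enriched": ["enrichment", "enriched"],
}


def has(alert, category):
    """True if the alert has a non-empty value in any field of the category."""
    return any(alert.get(field) for field in CATEGORY_FIELDS[category])


def alert_quality(alert):
    """Classify an alert (a dict of tags) as "Low", "Medium", or "High"."""
    # Medium: CI AND Check
    if not (has(alert, "CI") and has(alert, "Check")):
        return "Low"
    # High: Medium AND Ownership & Routing AND Business Impact
    #       AND (Runbooks OR Dependency OR Enriched)
    is_high = (
        has(alert, "Ownership & Routing")
        and has(alert, "Business Impact")
        and (has(alert, "Runbooks") or has(alert, "Dependency")
             or has(alert, "Enriched"))
    )
    return "High" if is_high else "Medium"


quality = alert_quality({
    "normalized_host": "web-1",
    "normalized_check": "cpu",
    "bp_assignment_group": "sre",
    "bp_priority": "P1",
    "runbook_url": "https://example.com/runbook",
})  # "High"
```

Note that bp_application, bp_service, and service appear under both CI and Dependency, so in this sketch an alert carrying one of those fields satisfies both categories.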

📘

If you need to define a field outside of the defaults for an alert quality category, reach out to BigPanda Support.

Alert Analysis

The Alert Analysis dashboard dives into metrics on alert payloads sent to BigPanda.

Widget

Description

Alert Quality and Reduction Over Time

Displays the alert payload quality over time. The shaded area shows the total incidents after correlation.

The line shows the Noise Reduction percentage over time.

Total Alerts by Hour

The hourly number of alert status changes by day of the week. (Sunday is 0)

Top Alerting Applications

The total number of incoming alerts and incidents created for each inbound tool. Tools are listed in order from highest to lowest alert count.

Top Alerting Hosts

The total number of incoming alerts and incidents created for each individual host. Hosts are listed in order from highest to lowest alert count.

Top Alerting Checks

The total number of incoming alerts and incidents created for each individual check. Checks are listed in order from highest to lowest alert count.

Top Recurring Alerts (MTBF)

Alerts that are sent to BigPanda most often, with MTBF and most recent occurrence listed.

Top Noisy Alerts

Alerts listed by the average number of status changes those alerts report to BigPanda.

Enrichment

The Enrichment dashboard provides an overview of correlation and enrichment trends within your incidents.

Widget

Description

Total Alerts

The total number of alerts received by BigPanda in the dashboard time period.

Enriched Alerts

The total number of alerts enriched by BigPanda in the dashboard time period.

Enrichment %

The percent of received alerts that were enriched by BigPanda.

Enrichment Hit Rate

The number of enriched and unenriched alerts entering BigPanda per day over a period of time. The line displays the Enrichment Hit Rate over time.

Alerts with No Enrichment

A count of the alerts that were not enriched, organized by host.

Executive Summary

The Executive Summary dashboard provides a high-level overview of alert and compression metrics for leadership.

Noise Reduction

BigPanda Noise Reduction is defined by the number of alerts that are prevented from reaching the end user. This is done through alert aggregation, enrichment, incident correlation, and filtering.

Widget

Description

Alert Noise Reduction

The percentage of duplicate and irrelevant alerts that BigPanda saved from reaching the end user. The process is done through alert aggregation, incident correlation, and filtering.

The following calculation is used:
Sum of Alert Status Changes / # Actionable Incidents.

Alert Noise Reduction Funnel

The funnel showcases the workload automation BigPanda brings in alert processing from deduplication down to actionable incidents.

This illustrates the main stages an alert will go through and provides the visualization needed to have meaningful conversations about the different phases.

Alert Quality and Reduction Over Time

Displays the alert payload quality over time. The shaded area shows the total incidents after correlation.

The line shows the Noise Reduction percentage over time.

MTTR Breakdown

MTTR Breakdown is used to measure Incident Management Performance using the BigPanda event data. Many teams use MTTR as the main KPI and then break it down to its different components for more granular reporting.

Widget

Description

MTTR in Min

The average MTTR in minutes and the MTTR percent decreased QoQ.

Total MTTx Breakdown

The breakdown of total MTTx. Allows managers to see where time is allocated and if there might be problems in their incident management process.

MTTR Breakdown Over Time

The MTTx over time and the volatility of the MTTR.

Volatility is a measurement of the spread of the values of MTTx within a given data set. If the values are all relatively consistent, volatility is low. If MTTx values are inconsistent, the average may be unaffected, but volatility is high.

Team Productivity

Team Productivity helps you understand the impact of BigPanda on team workload and performance. The two KPIs used are Alert Volume (amount of work) and Operational Workload (time spent).

Widget

Description

Team Workload - QoQ Change

The overall efficiency improvements across teams. Shows how alert volume and operational workload are changing over time.

Alert Volume - Displays the percent increase or decrease of alerts.

Operational Workload - Measures the impact of BigPanda on overall team efficiency. Operational Workload is calculated as Incident Count * MTTR.

Top Team KPIs

The three top level Team KPIs that indicate mature handling of incidents in BigPanda.

KPIs:
% Actioned Incidents - Percentage of Incidents that were handled by the team (as opposed to missed)

% Resolved < 1 Hour - Percentage of Incidents that were resolved in less than an hour

% Resolved in BP - Percentage of incidents that were resolved without Sharing to external targets

# of Incidents Per Resolution Bucket Over Time

The number of incidents handled for each resolution bucket, over time. Allows managers to investigate why incidents may be taking longer than expected.

Resolution buckets:
Still Open
Under 5 min
5 - 30 min
30 - 60 min
1 - 4 hours
4 - 24 hours
1 - 7 days
Over a week

The line charts the overall BigPanda Workload trend over time.

To learn more about how to best understand and leverage the data of the Executive Summary, check out the Unified Analytics Terminology guide in the BigPanda University Unified Analytics Course.

ITSM Tools

The ITSM Tools dashboard provides metrics for your outbound integrations that receive BigPanda incidents.

Widget

Description

Tickets Created

The number of all outbound shares.

Ticket Creation %

The percentage of incidents that resulted in an outbound ticket.

Auto Tickets

The number of outbound shares triggered by AutoShare rules.

Manual Tickets

The number of outbound shares created manually by a user in the UI.

Ticket Compression %

The alert compression percentage for incidents that were shared.

Shares by Tool

The percentage of outbound ticket shares sent to individual tool integrations.

Selecting the Others category in the chart will expand into a new pie chart listing only tools with smaller percentages.

Ticket Compression

The number of incidents that were shared, broken out by week.

The trendline tracks the overall alert compression percentage for incidents shared each week.

Top Tickets by Host

Tickets that have the most updates to share tools, sorted by host system.

Shares Over Time

The number of tickets shared to each tool per week.

Monitoring Sources

The Monitoring Sources dashboard provides detailed metrics on inbound integrations sending alert data to BigPanda.

Widget

Description

Monitoring Sources by Volume

The percentage of alert status changes sent by each inbound integration.

Monitoring Sources 30 Day Trend

The number of Critical or Warning alert status changes sent by inbound integrations per week.

Monitoring Tool Assessment

The flow of alerts from individual monitoring sources to their incident closure within BigPanda.

Alerts are classified as either high, medium, or low quality based on the level of BigPanda enrichment for the alert. See Alert Quality for more information.

Monitoring Sources by Status

The number of Critical and Warning alert status changes sent by each inbound integration.

Self Healed Alerts Trending

The number of alerts that were resolved without operator action within BigPanda per week.

The dotted line tracks the trend over the full reporting period.

MTTX Breakdown

The MTTX Breakdown dashboard dives into trends and statistics for organizational MTTx measures.

Widget

Description

Mean Time to Assign (Min)

The average amount of time it took the IT Ops team to assign the incident.

Mean Time to Engage (Min)

The average amount of time it took the IT Ops team to engage in resolving the incident. In BigPanda, this is measured by the time it takes to perform an action other than assign.

Mean Time to Resolve (Min)

The average amount of time it took to get back to service. MTTR looks at the repair of alert symptoms as opposed to the complete resolution of the incident. In BigPanda, it is calculated from the start of the first alert, to the resolution of the last alert.

MTTX breakdown over time

MTTx values per week.

The trendline tracks the MTTR Volatility over time.

MTTR by Application (minutes)

The MTTR for alerts with a Critical and Warning severity broken out by source integration.

Raw Data

The Raw Data dashboard contains all of your reporting data in simple table format ready for exporting or in-depth investigation.

Widget

Description

Raw Data

Lists data for each alert, in a sortable, interactive table. Blue hyperlinks enable operators to immediately jump to incidents in BigPanda.

This table can be exported to a CSV for additional reporting in an external tool.

Teams Efficiency

The Teams Efficiency dashboard explores user actions and behavior within BigPanda.

Widget

Description

Incident Count

The total number of incidents belonging to this business segment that entered BigPanda during the reporting time period.

Assigned

The number of incidents belonging to this business segment that were assigned to a user during the reporting time period.

Commented

The number of incidents belonging to this business segment that were commented on during the reporting time period.

Shared

The number of incidents belonging to this business segment that were automatically or manually shared during the reporting time period.

Manually Resolved

The number of incidents belonging to this business segment that were manually resolved during the reporting time period.

Merged/Split

The number of incidents belonging to this business segment that were merged or split during the reporting time period.

% of Incidents

The percentage of incidents that entered BigPanda during the reporting time period that belonged to this business segment.

% Assigned

The percentage of incidents belonging to this business segment that were assigned to a user during the reporting time period.

% Commented

The percentage of incidents belonging to this business segment that were commented on during the reporting time period.

% Shared

The percentage of incidents belonging to this business segment that were automatically or manually shared during the reporting time period.

% Manually Resolved

The percentage of incidents belonging to this business segment that were manually resolved during the reporting time period.

% Merged/Split

The percentage of incidents belonging to this business segment that were merged or split during the reporting time period.

User Stats

The number of times each user took specific actions within BigPanda toward managing incidents.

Total Actions

The number of times each action was taken on incidents per day.

User Actions per Day

The number of actions performed by each user per day.

Next Steps

Learn how to Manage Unified Analytics Reports

Learn the basics of Viewing Unified Analytics

Learn how to navigate the Analytics tab