Unified Analytics Key Metrics
Unified Analytics leverages several key metrics to give you insight into your incident management workflows.
Unified Analytics uses standard reporting definitions, calculated fields, and dashboards to provide insight into your incident management workflows. Use the definitions below to gain an understanding of the metrics provided by the data in Unified Analytics.
In-Progress Rollout
Your organization may not be using this feature yet as it is rolling out to our BigPanda customers through a staged migration.
If your organization is still using the previous analytics, you can learn more about creating reports in the BigPanda Reporting documentation.
BigPanda University Training
Learn how to maximize your Unified Analytics dashboards with the BPU Unified Analytics Certification. Learn best practices and advanced dashboard management, and test your knowledge to receive the official BigPanda Unified Analytics Credly badge.
In order to start the course, you may need to register for or log in to BPU.
Calculation Tags in Unified Analytics
To calculate analytics in standard dashboards, BigPanda uses specific tag fields. Some of these tags do not appear in other areas of BigPanda.
bp_* tags are normalized versions of common tags such as host or application. You may have these tags configured directly within alert enrichment, or they may have been created specifically for Unified Analytics.
bp_v_* tags are calculated from two or more tags in BigPanda. These tags exist in Unified Analytics, but not in other areas of BigPanda.
The decision about which BigPanda tags will be aggregated or normalized into calculation tags happens during the Unified Analytics onboarding process.
During Unified Analytics onboarding, make sure to clearly document which system tags are being used for analytics calculations within your organization.
Business Reporting Definitions
Unified Analytics uses standard fields and terminology to describe elements and stages of the incident management process.
BigPanda’s specific definition of these terms is based on industry standards and best practices to help you measure and track business and operational metrics.
Term | Description |
---|---|
Business Segment | A business portion or unit that drives unique action or functionality. For example, a high level Business Unit, Team, Group, or Organization. Default fields for business segment include: bp_v_business_segment, bp_group, bp_assignment_group, assignment_group. |
Business Category | A logical type of alert that can be connected to a meaningful function, service, or a recurring topic. This can be a function like infra, networking, application, etc. Or, a more technical alert category such as latency, load, etc. Default fields for business category include: bp_v_business_category, bp_category, alert_category. |
Config Item | An aggregated field used to capture the relevant configuration item. Default fields for config item include: application, bp_application, service, bp_service, bp_v_config_item. Custom fields can be set in addition to the defaults. |
Actionable Incident | An incident that contains high quality alerts enriched with both technical and business context. Unified Analytics uses the following criteria to determine if an incident is actionable: (1) the incident was explicitly defined as actionable using the bp_v_actionable tag; (2) the incident was enriched with business context (using the bp_v_business_segment tag); (3) the incident was acted upon; (4) the incident was not defined as noise (using the bp_v_alert_noise tag). The default field for actionable incidents is bp_v_actionable. |
Incident Outliers | When calculating MTTx metrics, a small number of incidents with very large times can significantly skew the total numbers. This field is used to filter out these incidents and normalize the MTTx measurement. By default, Incident Outliers are excluded from specific calculations to provide more accurate results. You can use the BP Incident Outlier field to turn this on or off. Incidents are set as an outlier if: the Assign time is over a day; the Engage time is over a day; the Resolution time is over a week. |
Mean Time to X (MTTx) | Mean Time to X (MTTx) measures the average time it takes to perform an action within BigPanda. MTTx metrics calculate the performance of your Incident Management team. Mean Time to Assign (MTTA) - The average amount of time it takes the IT Ops team to assign the incident. In BigPanda, MTTA is calculated based on the time until the assign action is used. Mean Time to Engage (MTTE) - The average amount of time it takes the IT Ops team to engage in handling the incident. In BigPanda, this is measured by the time it takes to perform an action other than assign. Mean Time to Fix (MTTF) - The average amount of time between engagement and resolution. In BigPanda, MTTF is automatically calculated from the time someone performs an action on the incident other than assign, to the resolution of the incident. Mean Time to Resolve (MTTR) - The average amount of time it took to get back to service. MTTR looks at the repair of alert symptoms as opposed to the complete resolution of the incident. In BigPanda, it is calculated from when the first event was received, to the resolution of the last alert. |
Mean Time Between Failures (MTBF) | The average amount of time between failures. MTBF measures issue recurrence, or the time between when an incident is resolved and when/how often it reoccurs. |
MTTR Volatility | Measures the consistency in the time it takes to resolve incidents. MTTR Volatility is measured based on the ratio between average MTTR and the median. |
Compression Ratio | The percent of alerts that were correlated and deduplicated into incidents. |
Enrichment Hit Rate | The percent of alerts that were enriched in BigPanda. |
BigPanda Workload | The number of resolved incidents multiplied by MTTR. Measures the impact of BigPanda on the overall team efficiency. |
Quarter over Quarter (QoQ) | Quarter over Quarter (QoQ) is the change from the previous quarter to the last quarter in the timeframe. If a timeframe has multiple quarters, the comparison will be from the last quarter in the timeframe to the previous one. If a timeframe has only one quarter, the comparison will be between that quarter and the one before it. |
Alert Status | The current status of the alert. Possible alert statuses include OK, Critical, or Warning. When alert status changes are counted, this includes the initial status. Changes to the OK or Resolved status are not counted. |
More BigPanda definitions can be found in the BigPanda Glossary.
To learn more about how to best understand and leverage these terms during reporting, check out the Unified Analytics Terminology guide in the BPU Unified Analytics Course.
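The MTTR, Incident Outliers, MTTR Volatility, and BigPanda Workload definitions above can be illustrated with a small sketch. This is not BigPanda's implementation. The `Incident` record and all function names are hypothetical, durations are assumed to be in minutes, an incident is assumed to be an outlier when any one of the three thresholds is exceeded, and the volatility ratio is assumed to be mean divided by median (the definition does not state the direction).

```python
from dataclasses import dataclass
from statistics import mean, median

DAY_MIN = 24 * 60        # one day, in minutes
WEEK_MIN = 7 * DAY_MIN   # one week, in minutes


@dataclass
class Incident:
    # Hypothetical record; all durations are in minutes.
    assign_min: float
    engage_min: float
    resolve_min: float


def is_outlier(inc: Incident) -> bool:
    # Assumption: exceeding any one of the three thresholds marks an outlier.
    return (
        inc.assign_min > DAY_MIN
        or inc.engage_min > DAY_MIN
        or inc.resolve_min > WEEK_MIN
    )


def mttr(incidents, exclude_outliers=True):
    # Average resolution time, optionally skipping outliers.
    kept = [i for i in incidents if not (exclude_outliers and is_outlier(i))]
    return mean(i.resolve_min for i in kept)


def mttr_volatility(incidents):
    # Assumption: the ratio is mean / median of the resolution times.
    times = [i.resolve_min for i in incidents]
    return mean(times) / median(times)


def workload(incidents):
    # Number of resolved incidents multiplied by their MTTR.
    return len(incidents) * mttr(incidents, exclude_outliers=False)


incidents = [
    Incident(2.0, 5.0, 10.0),
    Incident(3.0, 6.0, 20.0),
    Incident(4.0, 9.0, 30.0),
    Incident(5.0, 8.0, 20000.0),  # resolution took longer than a week
]
valid = [i for i in incidents if not is_outlier(i)]

print(mttr(incidents))                          # outlier excluded
print(mttr(incidents, exclude_outliers=False))  # outlier included
print(mttr_volatility(valid))
print(workload(valid))
```

With the sample data, a single week-long incident moves the average from 20 minutes to over 5,000 minutes, which is the skew the outlier filter is meant to remove.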
Calculated Fields
The following dashboard fields are calculated within the cached data model:
Term | Dashboard Field | Calculation |
---|---|---|
Incident Closure | BP Incident Closure | Based on actions, incidents are categorized as: Shared - The incident was manually shared; Auto Shared - The incident was AutoShared; Auto Resolved - The incident was resolved externally; Missed - The incident resolved itself without intervention; No Action - The incident was resolved manually with no other action; Resolved in BP - The incident was resolved within BigPanda; Still Open - The incident is still open. |
Incident Outliers | BP Incident Outlier | Incidents are set as an outlier if: Assign time is over a day; Engage time is over a day; Resolution time is over a week. If an incident isn’t an outlier, the setting is Valid. |
Resolution Bucket | BP Incident Resolution Bucket | Resolution buckets include the following: Still Open, Under 5 min, 5 - 30 min, 30 - 60 min, 1 - 4 hours, 4 - 24 hours, 1 - 7 days, Over a week. |
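As a rough illustration of how the resolution buckets in the table partition incidents, the sketch below maps a resolution time in minutes to a bucket label. The function name is hypothetical, and the handling of exact boundary values (for example, whether exactly 5 minutes falls in "Under 5 min" or "5 - 30 min") is an assumption, since the table does not specify it.

```python
# Upper bounds in minutes, paired with the bucket label from the table.
BUCKETS = [
    (5, "Under 5 min"),
    (30, "5 - 30 min"),
    (60, "30 - 60 min"),
    (4 * 60, "1 - 4 hours"),
    (24 * 60, "4 - 24 hours"),
]
WEEK_MIN = 7 * 24 * 60


def resolution_bucket(resolve_min):
    """Return the bucket label for a resolution time in minutes.

    None means the incident has not been resolved yet.
    Assumption: each bucket includes its lower bound and excludes its upper
    bound, except "1 - 7 days", which includes exactly one week.
    """
    if resolve_min is None:
        return "Still Open"
    for upper, label in BUCKETS:
        if resolve_min < upper:
            return label
    return "1 - 7 days" if resolve_min <= WEEK_MIN else "Over a week"


for minutes in (None, 3, 45, 600, 20000):
    print(minutes, "->", resolution_bucket(minutes))
```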
Alert Quality
BigPanda categorizes alert quality based on the amount of enrichment or context available.
The following alert quality categories are available:
Low Quality Alerts - Alerts that lack key information or which were identified as irrelevant or misconfigured. Low quality alerts do not meet the logic criteria for Medium or High Quality Alerts.
Medium Quality Alerts - Alerts that contain the minimum level of information and technical context to support operator action. These alerts lack some valuable elements such as business context, dependencies, or resolution steps.
Medium Quality Alert Logic
The logic for defining medium quality alerts is CI AND Check.
The following categories and fields define medium alert quality:
Category | Definition | Default Fields |
---|---|---|
CI | Configuration item. | normalized_host, bp_application, bp_service, service, bp_v_config_item |
Check | Symptom of the problem. | normalized_check |
High Quality Alerts - Alerts that contain all important data needed for a team to triage and resolve the alert, with a focus on business context.
High Quality Alert Logic
High Quality Alerts must meet the same criteria as Medium Quality Alerts. The additional logic for defining high quality alerts is Ownership & Routing AND Business Impact AND (Runbooks OR Dependency OR Enriched).
The following categories and fields define high alert quality:
Category | Definition | Default Fields |
---|---|---|
Ownership & Routing | The relevant part of the business and/or team that should handle it. | bp_assignment_group, assignment_group, bp_group, bp_v_business_segment |
Business Impact | The impact of the alert on the business. Can be priority level, application tiers, etc. | bp_priority, priority, bp_impact, bp_urgency |
Runbooks | Information about how the alert should be handled. | bp_runbook, kb, runbook_url |
Dependency | Understanding of the services or application it impacts. | impacted_service, bp_application, application, bp_service, service |
Enriched | If the alert was enriched or not. | enrichment, enriched |
If you need to define a field outside of the defaults for an alert quality category, reach out to BigPanda Support and request a product change.
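The medium and high quality logic described above can be sketched as a small classifier. This is an illustration only, not BigPanda's implementation. It assumes an alert is a plain dictionary of tags, that a category is satisfied when at least one of its default fields is present and non-empty, and that `assignment_group` is spelled as in the Business Segment defaults. The function and dictionary names are hypothetical, and the sketch does not model alerts flagged as irrelevant or misconfigured.

```python
# Default fields per category, taken from the tables above.
CATEGORY_FIELDS = {
    "ci": ["normalized_host", "bp_application", "bp_service", "service",
           "bp_v_config_item"],
    "check": ["normalized_check"],
    "ownership": ["bp_assignment_group", "assignment_group", "bp_group",
                  "bp_v_business_segment"],
    "impact": ["bp_priority", "priority", "bp_impact", "bp_urgency"],
    "runbooks": ["bp_runbook", "kb", "runbook_url"],
    "dependency": ["impacted_service", "bp_application", "application",
                   "bp_service", "service"],
    "enriched": ["enrichment", "enriched"],
}


def has(alert, category):
    # Assumption: a category is met if any default field has a value.
    return any(alert.get(f) not in (None, "") for f in CATEGORY_FIELDS[category])


def alert_quality(alert):
    # Medium: CI AND Check.
    if not (has(alert, "ci") and has(alert, "check")):
        return "Low"
    # High: Medium AND Ownership & Routing AND Business Impact
    #       AND (Runbooks OR Dependency OR Enriched).
    high = (
        has(alert, "ownership")
        and has(alert, "impact")
        and (has(alert, "runbooks") or has(alert, "dependency")
             or has(alert, "enriched"))
    )
    return "High" if high else "Medium"


low = {"normalized_host": "web-1"}
medium = {"normalized_host": "web-1", "normalized_check": "cpu"}
high = dict(medium, bp_group="payments", priority="P1",
            runbook_url="https://example.com/runbook")

print(alert_quality(low), alert_quality(medium), alert_quality(high))
```

Because several default fields (bp_application, bp_service, service) appear under both CI and Dependency, an alert that meets CI through one of those fields also meets Dependency in this sketch.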
Alert Analysis
The Alert Analysis dashboard dives into metrics on alert payloads sent to BigPanda.
Widget | Description |
---|---|
Alert Quality and Reduction Over Time | Displays the alert payload quality over time. The shaded area shows the total incidents after correlation. The line shows the Noise Reduction percentage over time. |
Total Alerts by Hour | The hourly number of alert status changes by day of the week. (Sunday is 0) |
Top Alerting Applications | The total number of incoming alerts and incidents created for each inbound tool. Tools are listed in order from highest to lowest alert count. |
Top Alerting Hosts | The total number of incoming alerts and incidents created for each individual host. Hosts are listed in order from highest to lowest alert count. |
Top Alerting Checks | The total number of incoming alerts and incidents created for each individual check. Checks are listed in order from highest to lowest alert count. |
Top Recurring Alerts (MTBF) | Alerts that are sent to BigPanda most often, with MTBF and most recent occurrence listed. |
Top Noisy Alerts | Alerts listed by the average number of status changes those alerts report to BigPanda. |
Enrichment
The Enrichment dashboard provides an overview of correlation and enrichment trends within your incidents.
Widget | Description |
---|---|
Total Alerts | The total number of alerts received by BigPanda in the dashboard time period. |
Enriched Alerts | The total number of alerts enriched by BigPanda in the dashboard time period. |
Enrichment % | The percent of received alerts that were enriched by BigPanda. |
Enrichment Hit Rate | The number of enriched and unenriched alerts entering BigPanda per day over a period of time. The line displays the Enrichment Hit Rate over time. |
Alerts with No Enrichment | A count of the alerts that were not enriched, organized by host. |
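As a minimal sketch of the Enrichment % and per-day Enrichment Hit Rate widgets, the snippet below computes the percentage of enriched alerts from raw counts. The function name, the daily data layout, and the choice to return 0.0 when no alerts were received are assumptions made for illustration.

```python
def enrichment_pct(total_alerts, enriched_alerts):
    # Percent of received alerts that were enriched.
    # Assumption: report 0.0 when no alerts were received.
    if total_alerts == 0:
        return 0.0
    return 100.0 * enriched_alerts / total_alerts


# Hypothetical daily counts: (day, enriched, unenriched).
daily = [
    ("Mon", 150, 50),
    ("Tue", 90, 10),
    ("Wed", 0, 0),
]

hit_rate_by_day = {
    day: enrichment_pct(enriched + unenriched, enriched)
    for day, enriched, unenriched in daily
}
print(hit_rate_by_day)
```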
Executive Summary
The Executive Summary dashboard provides a high-level overview of alert and compression metrics for leadership.
Noise Reduction
BigPanda Noise Reduction is defined by the number of alerts that are prevented from reaching the end user. This is done through alert aggregation, enrichment, incident correlation, and filtering.
Widget | Description |
---|---|
Alert Noise Reduction | The percentage of duplicate and irrelevant alerts that BigPanda saved from reaching the end user. The process is done through alert aggregation, incident correlation, and filtering. The following calculation is used: Sum of Alert Status Changes / # Actionable Incidents. |
Alert Noise Reduction Funnel | The funnel showcases the workload automation BigPanda brings in alert processing from deduplication down to actionable incidents. This illustrates the main stages an alert will go through and provides the visualization needed to have meaningful conversations about the different phases. |
Alert Quality and Reduction Over Time | Displays the alert payload quality over time. The shaded area shows the total incidents after correlation. The line shows the Noise Reduction percentage over time. |
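The Alert Noise Reduction calculation is stated above as Sum of Alert Status Changes / # Actionable Incidents. The sketch below computes exactly that ratio and nothing more. How the dashboard turns the ratio into the displayed percentage is not described here, so no conversion is attempted. The function name and input shape are hypothetical.

```python
def noise_reduction_ratio(status_changes_per_alert, actionable_incidents):
    # Sum of alert status changes divided by the number of actionable incidents.
    if actionable_incidents <= 0:
        raise ValueError("at least one actionable incident is required")
    return sum(status_changes_per_alert) / actionable_incidents


# Three alerts with 3, 5, and 2 status changes, rolled up into 2 actionable incidents.
print(noise_reduction_ratio([3, 5, 2], 2))
```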
MTTR Breakdown
MTTR Breakdown is used to measure Incident Management Performance using the BigPanda event data. Many teams use MTTR as the main KPI and then break it down to its different components for more granular reporting.
Widget | Description |
---|---|
MTTR in Min | The average MTTR in minutes and the MTTR percent decreased QoQ. |
Total MTTx Breakdown | The breakdown of total MTTx. Allows managers to see where time is allocated and if there might be problems in their incident management process. |
MTTR Breakdown Over Time | The MTTx over time and the volatility of the MTTR. Volatility is a measurement of the spread of the values of MTTx within a given data set. If the values are all relatively consistent, volatility is low. If MTTx values are inconsistent, the average may be unaffected, but volatility is high. |
Team Productivity
Team Productivity helps you understand the impact on Team workload and performance while using BigPanda. The two KPIs used are Alert Volume (amount of work) and Operational workload (time spent).
Widget | Description |
---|---|
Team Workload - QoQ Change | The overall efficiency improvements across teams. Shows how alert volume and operational workload are changing over time. Alert Volume - Displays the percent increase or decrease of alerts. Operational Workload - Measures the impact of BigPanda on overall team efficiency. Operational workload is calculated using (Incident Count * MTTR). |
Top Team KPIs | The three top level Team KPIs that indicate mature handling of incidents in BigPanda. KPIs: % Actioned Incidents - Percentage of incidents that were handled by the team (as opposed to missed); % Resolved < 1 Hour - Percentage of incidents that were resolved in less than an hour; % Resolved in BP - Percentage of incidents that were resolved without sharing to external targets. |
# of Incidents Per Resolution Bucket Over Time | The number of incidents handled for each resolution bucket, over time. Allows managers to investigate why incidents may be taking longer than expected. Resolution buckets: Still Open, Under 5 min, 5 - 30 min, 30 - 60 min, 1 - 4 hours, 4 - 24 hours, 1 - 7 days, Over a week. The line charts the overall BigPanda Workload trend over time. |
To learn more about how to best understand and leverage the data of the Executive Summary, check out the Unified Analytics Terminology guide in the BPU Unified Analytics Course.
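The Team Workload - QoQ Change widget combines two definitions from this page: operational workload (Incident Count * MTTR) and the quarter-over-quarter comparison. A minimal sketch, assuming the change is expressed as a standard percent change from the previous quarter to the last quarter, looks like this. The function names and the sample figures are hypothetical.

```python
def operational_workload(incident_count, mttr_min):
    # Incident Count * MTTR, with MTTR in minutes.
    return incident_count * mttr_min


def qoq_change_pct(previous, last):
    # Assumption: percent change relative to the previous quarter.
    if previous == 0:
        raise ValueError("previous quarter value must be non-zero")
    return 100.0 * (last - previous) / previous


previous_quarter = operational_workload(incident_count=400, mttr_min=50)
last_quarter = operational_workload(incident_count=300, mttr_min=40)

# A negative value means the workload decreased quarter over quarter.
print(qoq_change_pct(previous_quarter, last_quarter))
```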
ITSM Tools
The ITSM Tools dashboard provides metrics for your outbound integrations that receive BigPanda incidents.
Widget | Description |
---|---|
Tickets Created | The number of all outbound shares. |
Ticket Creation % | The percentage of incidents that resulted in an outbound ticket. |
Auto Tickets | The number of outbound shares triggered by AutoShare rules. |
Manual Tickets | The number of outbound shares created manually by a user in the UI. |
Ticket Compression % | The alert compression percentage for incidents that were shared. |
Shares by Tool | The percentage of outbound ticket shares sent to individual tool integrations. Selecting the Others category in the chart will expand into a new pie chart listing only tools with smaller percentages. |
Ticket Compression | The number of incidents that were shared, broken out by week. The trendline tracks the overall alert compression percentage for incidents shared each week. |
Top Tickets by Host | Tickets that have the most updates to share tools, sorted by host system. |
Shares Over Time | The number of tickets shared to each tool per week. |
Monitoring Sources
The Monitoring Sources dashboard provides detailed metrics on inbound integrations sending alert data to BigPanda.
Widget | Description |
---|---|
Monitoring Sources by Volume | The percentage of alert status changes sent by each inbound integration. |
Monitoring Sources 30 Day Trend | The number of Critical or Warning alert status changes sent by inbound integrations per week. |
Monitoring Tool Assessment | The flow of alerts from individual monitoring sources to their incident closure within BigPanda. Alerts are classified as either high, medium, or low quality based on the level of BigPanda enrichment for the alert. See Alert Quality for more information. |
Monitoring Sources by Status | The number of Critical and Warning alert status changes sent by each inbound integration. |
Self Healed Alerts Trending | The number of alerts that were resolved without operator action within BigPanda per week. The dotted line tracks the trend over the full reporting period. |
MTTX Breakdown
The MTTX Breakdown dashboard dives into trends and statistics for organizational MTTx measures.
Widget | Description |
---|---|
Mean Time to Assign (Min) | The average amount of time it took the IT Ops team to assign the incident. |
Mean Time to Engage (Min) | The average amount of time it took the IT Ops team to engage in resolving the incident. In BigPanda, this is measured by the time it takes to perform an action other than assign. |
Mean Time to Resolve (Min) | The average amount of time it took to get back to service. MTTR looks at the repair of alert symptoms as opposed to the complete resolution of the incident. In BigPanda, it is calculated from the start of the first alert, to the resolution of the last alert. |
MTTX breakdown over time | MTTx values per week. The trendline tracks the MTTR Volatility over time. |
MTTR by Application (minutes) | The MTTR for alerts with a Critical and Warning severity broken out by source integration. |
Raw Data
The Raw Data dashboard contains all of your reporting data in simple table format ready for exporting or in-depth investigation.
Widget | Description |
---|---|
Raw Data | Lists data for each alert, in a sortable, interactive table. Blue hyperlinks enable operators to immediately jump to incidents in BigPanda. This table can be exported to a CSV for additional reporting in an external tool. |
Teams Efficiency
The Teams Efficiency dashboard explores user actions and behavior within BigPanda.
Widget | Description |
---|---|
Incident Count | The total number of incidents belonging to this business segment that entered BigPanda during the reporting time period. |
Assigned | The number of incidents belonging to this business segment that were assigned to a user during the reporting time period. |
Commented | The number of incidents belonging to this business segment that were commented on during the reporting time period. |
Shared | The number of incidents belonging to this business segment that were auto- or manually-shared during the reporting time period. |
Manually Resolved | The number of incidents belonging to this business segment that were manually resolved during the reporting time period. |
Merged/Split | The number of incidents belonging to this business segment that were merged or split during the reporting time period. |
% of Incidents | The percentage of incidents that entered BigPanda during the reporting time period that belonged to this business segment. |
% Assigned | The percentage of incidents belonging to this business segment that were assigned to a user during the reporting time period. |
% Commented | The percentage of incidents belonging to this business segment that were commented on during the reporting time period. |
% Shared | The percentage of incidents belonging to this business segment that were auto- or manually-shared during the reporting time period. |
% Manually Resolved | The percentage of incidents belonging to this business segment that were manually resolved during the reporting time period. |
% Merged/Split | The percentage of incidents belonging to this business segment that were merged or split during the reporting time period. |
User Stats | The number of times each user took specific actions within BigPanda toward managing incidents. |
Total Actions | The number of times each action was taken on incidents per day. |
User Actions per Day | The number of actions performed by each user per day. |
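The count and percentage widgets in this table follow one pattern: count the incidents in a business segment that had a given action, then divide by the segment's incident count. The sketch below shows that pattern on a hypothetical list of incident records. The record layout, action names, and function name are assumptions, not the Unified Analytics data model.

```python
ACTIONS = ("assigned", "commented", "shared", "manually_resolved", "merged_split")


def segment_stats(incidents):
    # Group incidents by business segment.
    total = len(incidents)
    by_segment = {}
    for inc in incidents:
        by_segment.setdefault(inc["segment"], []).append(inc)

    stats = {}
    for segment, rows in by_segment.items():
        row = {
            "incident_count": len(rows),
            # Share of all incidents that belong to this segment.
            "pct_of_incidents": 100.0 * len(rows) / total,
        }
        for action in ACTIONS:
            count = sum(1 for r in rows if action in r["actions"])
            row[action] = count
            # Share of this segment's incidents that had the action.
            row["pct_" + action] = 100.0 * count / len(rows)
        stats[segment] = row
    return stats


sample = [
    {"segment": "payments", "actions": {"assigned", "commented"}},
    {"segment": "payments", "actions": set()},
    {"segment": "search", "actions": {"shared"}},
    {"segment": "search", "actions": {"shared", "manually_resolved"}},
]
stats = segment_stats(sample)
print(stats["payments"]["pct_assigned"], stats["search"]["pct_shared"])
```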
Next Steps
Learn how to Manage Unified Analytics Reports
Learn the basics of Viewing Unified Analytics
Learn how to navigate the Analytics tab