Runbook

Istio Latency 99 Percentile Incident

Overview

This incident fires when Istio's 99th percentile (p99) request latency exceeds 1 second, meaning the slowest 1% of requests take longer than 1 second to complete. This degrades performance and the user experience, so investigate promptly to resolve the issue and prevent further impact.
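To make the threshold concrete, here is a minimal sketch of a p99 check over raw latency samples. It assumes the nearest-rank definition of a percentile; Prometheus and other backends estimate percentiles from histogram buckets instead, so their values can differ slightly. The function names are illustrative and not part of Istio.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def p99_breached(latencies_seconds, threshold_seconds=1.0):
    """True when the 99th percentile latency is above the threshold."""
    return percentile(latencies_seconds, 99) > threshold_seconds
```

With 100 samples, two requests slower than 1 second are enough to breach the threshold, while a single slow request is not.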

Debug

1. Check Istio version

2. Get Istio pod status
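One possible way to triage pod status is to parse the JSON that `kubectl get pods -n istio-system -o json` prints. The sketch below assumes Istio runs in the default `istio-system` namespace and that the JSON has already been captured into a string; `unhealthy_pods` is a hypothetical helper name.

```python
import json


def unhealthy_pods(pod_list_json):
    """Return names of pods that are not Running or have a container that is not ready."""
    pods = json.loads(pod_list_json)["items"]
    bad = []
    for pod in pods:
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        running = status.get("phase") == "Running"
        # A pod with no reported container statuses is treated as not ready.
        all_ready = bool(containers) and all(c.get("ready") for c in containers)
        if not (running and all_ready):
            bad.append(pod["metadata"]["name"])
    return bad
```

Pods from completed Jobs (phase `Succeeded`) are also reported by this check, so filter them out if the namespace contains any.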

3. Check Istio telemetry service status

4. Check Istio telemetry configuration

5. Check Istio Mixer config (applies only to Istio releases before 1.8, which removed Mixer)

6. Check Istio Mixer log

7. Check Istio gateway configuration

8. Check Istio virtual service configuration

9. Check Istio destination rule configuration
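Steps 7 to 9 all amount to listing Istio networking resources. As a sketch, the helper below builds the `kubectl get` argument list for each kind, using the fully qualified CRD names so they are not confused with other resources called "gateway". The short keys and the helper name are assumptions made for this example.

```python
# Fully qualified names of the Istio networking CRDs covered by steps 7 to 9.
ISTIO_KINDS = {
    "gateway": "gateways.networking.istio.io",
    "virtualservice": "virtualservices.networking.istio.io",
    "destinationrule": "destinationrules.networking.istio.io",
}


def kubectl_get_args(kind, namespace=None):
    """Build the argv for dumping one kind of Istio resource as YAML."""
    resource = ISTIO_KINDS[kind]  # raises KeyError for an unknown kind
    args = ["kubectl", "get", resource, "-o", "yaml"]
    # Search every namespace unless one is given explicitly.
    args += ["-n", namespace] if namespace else ["--all-namespaces"]
    return args
```

The returned list can be passed to `subprocess.run`, or printed with `shlex.join` to copy into a terminal.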

10. Check Istio metrics
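If Istio metrics are scraped by Prometheus, the p99 latency can be read from the standard `istio_request_duration_milliseconds` histogram. The sketch below only builds the PromQL string; the 5 minute window, the `destination` reporter, and grouping by `destination_service` are assumptions to adjust for your setup. Istio releases that still use Mixer export `istio_request_duration_seconds` instead.

```python
def p99_latency_query(window="5m", reporter="destination", threshold_ms=None):
    """Build a PromQL query for p99 request latency (milliseconds) per destination service."""
    query = (
        "histogram_quantile(0.99, "
        f'sum(rate(istio_request_duration_milliseconds_bucket{{reporter="{reporter}"}}[{window}])) '
        "by (le, destination_service))"
    )
    if threshold_ms is not None:
        # Keep only the services whose p99 is above the threshold.
        query += f" > {threshold_ms}"
    return query
```

Calling `p99_latency_query(threshold_ms=1000)` yields a query that returns only the services whose p99 exceeds the 1 second threshold.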

11. Check Istio logs for specific pod

Resource constraints: The Istio service may be resource constrained. If too little CPU or memory is allocated to it, request latency can increase.
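A rough way to test for a CPU constraint is to compare the usage reported by `kubectl top pods -n istio-system` with the container's CPU limit. The sketch below assumes quantities are written either in millicores (`250m`) or in whole or fractional cores (`0.5`); the 80% threshold is an arbitrary example value, not an Istio recommendation.

```python
def cpu_to_millicores(quantity):
    """Convert a Kubernetes CPU quantity such as '250m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return float(quantity[:-1])
    return float(quantity) * 1000


def cpu_utilization(usage, limit):
    """Fraction of the CPU limit currently in use."""
    return cpu_to_millicores(usage) / cpu_to_millicores(limit)


def is_constrained(usage, limit, threshold=0.8):
    """True when usage is at or above the given fraction of the limit."""
    return cpu_utilization(usage, limit) >= threshold
```

For example, a pod using `450m` against a `500m` limit is at 90% of its limit and would be flagged.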

Repair

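Scale the resources allocated to the Istio service up or down as needed: raise the CPU or memory allocation, or add replicas, if it is resource constrained, and scale back down once latency has recovered.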
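Changing the replica count can be done with `kubectl scale`. The helper below is a sketch that only builds the command; the `istio-system` namespace default and the `istio-ingressgateway` deployment used in the usage note are assumptions, so substitute the deployment that is actually constrained.

```python
def scale_args(deployment, replicas, namespace="istio-system"):
    """Build the argv for scaling a deployment to a fixed replica count."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return [
        "kubectl", "scale", "deployment", deployment,
        "-n", namespace,
        f"--replicas={replicas}",
    ]
```

For instance, `scale_args("istio-ingressgateway", 3)` produces the arguments for running three gateway replicas; pass the list to `subprocess.run` to apply it.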
