High – potentially hours of downtime
Weekly for fleets with hundreds of nodes
~ 1-6 hours
Some network-related issues are very hard to diagnose because they don't occur consistently across the entire network. In many situations, basic fleet-wide checks suggest that 99% of the fleet is performing normally and that there is only mild variability in network connectivity. In reality, a small number of nodes may no longer be able to connect to the network at all.
This can lead to a very bad experience for a small number of customers. These incidents are often hard to diagnose because finding the affected nodes is like searching for a needle in a haystack.
The larger the fleet, the more likely companies are to experience this type of incident.
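To make the masking effect concrete, here is a minimal Python sketch. It assumes a hypothetical fleet of 200 nodes where each node reports a connectivity success rate between 0 and 1; the node names, the `summarize_fleet` helper, and the threshold are illustrative assumptions and are not part of Shoreline's product or API.

```python
from statistics import mean

def summarize_fleet(success_by_node, dead_threshold=0.0):
    """Return the fleet-wide mean success rate and the nodes at or below the threshold."""
    fleet_rate = mean(success_by_node.values())
    unreachable = sorted(
        node for node, rate in success_by_node.items() if rate <= dead_threshold
    )
    return fleet_rate, unreachable

# Hypothetical data: 198 healthy nodes and 2 nodes that cannot reach the network.
fleet = {f"node-{i:03d}": 1.0 for i in range(198)}
fleet["node-198"] = 0.0
fleet["node-199"] = 0.0

fleet_rate, unreachable = summarize_fleet(fleet)
print(f"fleet success rate: {fleet_rate:.1%}")
print(f"unreachable nodes: {unreachable}")
```

The aggregate looks healthy even though two nodes are completely cut off, which is why a per-node breakdown is needed to find them.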
Typically, Shoreline does not trigger an automated repair for this type of incident. Instead, Shoreline provides a series of diagnostics that help on-call teams more quickly pinpoint the specific network issue and the nodes it affects. These diagnostics eliminate the hours that operators would otherwise spend trying to uncover the issue manually.
Here are the diagnostics run by Shoreline: