Stop-the-world garbage collection pauses every application thread, which makes it a problem for any company that runs the JVM at scale. If your on-call team doesn’t reach the machine at the moment the pause occurs, the evidence is gone, and they are left guessing what happened. They go from machine to machine hoping to observe the issue again, wasting valuable time and resources. And while tools are available to debug the JVM, they require tremendous effort to set up and rarely solve the problem, because they need to be connected at just the right time to witness it.
Finding transient issues shouldn’t take weeks or require a sprint. Shoreline solves this by:
Identifying the problem by monitoring all of your JVMs and capturing per-second data, whether you run a few JVMs or thousands.
When a memory or performance limit is reached, capturing garbage collection, heap, thread, and deadlock data and storing it in an S3 bucket so the engineering team can find and permanently fix the root cause.
Optionally restarting the JVM or taking another remediating action such as scaling up or down.
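To make the capture step concrete, here is a minimal sketch of the kind of data a JVM can report about itself through the standard `java.lang.management` API. It is not Shoreline's implementation: the class name `JvmSnapshot`, the 90% heap threshold, and the text output format are all assumptions made for illustration. It records heap usage statistics rather than a full heap dump, and the upload to S3 is left out.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class JvmSnapshot {
    // Hypothetical limit: capture debug data once the heap is 90% full.
    static final double HEAP_THRESHOLD = 0.90;

    // Fraction of the heap in use, measured against the maximum heap size,
    // or against committed memory when no maximum is defined (getMax() == -1).
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    // Collects garbage collection, heap, thread, and deadlock data as text.
    static String capture() {
        StringBuilder out = new StringBuilder();

        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        out.append("heap used=").append(heap.getUsed())
           .append(" max=").append(heap.getMax()).append('\n');

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append("gc ").append(gc.getName())
               .append(" count=").append(gc.getCollectionCount())
               .append(" timeMs=").append(gc.getCollectionTime()).append('\n');
        }

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] deadlocked = threads.findDeadlockedThreads(); // null when there is no deadlock
        out.append("deadlocked threads=")
           .append(deadlocked == null ? 0 : deadlocked.length).append('\n');

        // Thread names and states only; lock details are skipped to keep the dump cheap.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append("thread ").append(info.getThreadName())
               .append(" state=").append(info.getThreadState()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        double used = heapUsedFraction();
        System.out.printf("heap used: %.1f%%%n", used * 100);
        if (used >= HEAP_THRESHOLD) {
            // A real setup would ship this snapshot to durable storage such as S3.
            System.out.print(capture());
        }
    }
}
```

The sketch runs inside the JVM it inspects, so the data is gathered at the moment the limit is crossed, not after an engineer has attached a debugger. Monitoring from outside the process, as the list above describes, would need a different mechanism such as remote JMX or an agent.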
Monitors across all JVMs
Captures debug data
Optionally restarts the JVM or takes another remediating action