December 9, 2021 | Anurag Gupta

Webinar: 5 Technical Lessons Learned from Outages at AWS, Google and Microsoft

We can learn something from every outage, whether it hits a start-up or a hyperscaler like Amazon or Google. In this webinar, we hear from two reliability experts: Niall Murphy (former head of SRE at Microsoft and Google) and Anurag Gupta (former VP of AWS database and analytic services). The session covers technical lessons learned from large outages at Amazon and Google, including:

  • The importance of automation and how to build circuit breakers to mitigate risk
  • Scaling automation: what works at 1,000 customers may not work at 1 million customers
  • Botched rollouts: don’t forget to check the failure rate of distributed jobs
  • The anti-80/20 rule: avoid your biggest potential Achilles heel by building redundancy into the systems you use the least
  • How to conduct blameless post-mortems that maximize the lessons learned from any outage
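
To make the circuit-breaker and failure-rate points a little more concrete, here is a minimal sketch of what such a guard might look like. It is not taken from the webinar: the `RolloutBreaker` class, the `run_rollout` helper, and the threshold values are all hypothetical, chosen only to illustrate the idea of stopping automation once too many distributed jobs fail.

```python
from dataclasses import dataclass


@dataclass
class RolloutBreaker:
    """Halts an automated rollout once the observed job failure rate gets too high."""
    max_failure_rate: float = 0.05  # hypothetical threshold: trip above 5% failures
    min_samples: int = 20           # hypothetical: wait for this many results first
    successes: int = 0
    failures: int = 0
    tripped: bool = False

    def record(self, ok: bool) -> None:
        """Record one job result and trip the breaker if the failure rate is too high."""
        if ok:
            self.successes += 1
        else:
            self.failures += 1
        total = self.successes + self.failures
        if total >= self.min_samples and self.failures / total > self.max_failure_rate:
            self.tripped = True

    def allow(self) -> bool:
        """Return True while the automation may keep going."""
        return not self.tripped


def run_rollout(results, breaker):
    """Process job results in order; return how many ran before the breaker tripped."""
    processed = 0
    for ok in results:
        if not breaker.allow():
            break
        breaker.record(ok)
        processed += 1
    return processed
```

In this sketch the breaker waits for a minimum number of results before judging the failure rate, so a single early failure does not halt the rollout, and once tripped it stays tripped until a person resets it.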
