A collection of templates ported from the SRE Workbook
Updated Aug 24, 2018
A party card game for engineers who care about reliability. Based on Cards Against Humanity.
Overall map of topics to cover for my “Engineering for Site Reliability” blog series.
Gerd by Onyx is a lightweight chaos monkey implementation for Kubernetes (k8s)
Calculate how much downtime should be permitted in your Service Level Agreement or Objective
A list of common Disaster Recovery (DR) scenarios for software companies
An ongoing, curated collection of awesome SRE software and tools, libraries and frameworks, engineering books and blogs, philosophical principles, technical guidelines, and practical tools for the field of Site Reliability Engineering (SRE)
🔖 Daily-updated reading list for designing High Scalability 🍒, High Availability 🔥, High Stability 🗻 back-end systems - Pull requests are greatly welcome 👬 I hope you will find this project helpful 🍀 Please help me share it to more and more people ❤️ Thank you - 谢谢 - धन्यवाद - ধন্যবাদ - Спасибо - شكرا - Merci - Gracias - Danke - Cảm ơn! 🙇
A .NET Standard library for working with the Uptime Robot API.
A curated list of awesome Site Reliability and Production Engineering resources.
A collection of postmortem templates
A curated list of Site Reliability and Production Engineering resources.
A role-playing game for incident management training