A curated list of Site Reliability and Production Engineering resources.
Updated Jun 10, 2024
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
A collection of postmortems. Sorry for the delay in merging PRs!
A collection of postmortem templates
Compilation of public failure/horror stories related to Kubernetes
A curated list of Site Reliability and Production Engineering Tools
Postmortem debugging tools for MinGW.
An Incident Management Process / Post Mortem Template
A curated list of awesome Site Reliability and Production Engineering resources.
Analysis of the major exploits that took place on the Ethereum blockchain
Selection of Development Templates
💀 🔥 ❄️ A basic analyzer for memory dumps containing managed code
Compilation of public incident/interesting/horror stories related to Kafka operations
Shell_basics
Perform post-mortem Linux baselining and forensic analysis.
How to run effective incident post-mortems