In early September, The Matrix Foundation homeserver went down.
I'm extremely proud of our SRE team. They had a Disaster Recovery Plan and ran monthly exercises to practise it, so no data was lost despite a 24h outage.
I learned a lot about how to properly back up and restore a Postgres database while writing this post with the SREs. We also learned how to better prevent human error and be more resilient to it.
Thanks all for the hugops during the outage!
matrix.org/blog/2025/10/post-m…
Post-mortem of the September 2 outage
Matrix, the open protocol for secure decentralised communications
Matthew Hodgson (matrix.org)