Items tagged with: SRE


The best way to test the resilience of your service is: reboot the server. I mean it. Don't just restart the service, reboot the server. A story in two parts:

Part 1: a while ago I configured ZFS under my Talos Kubernetes cluster, and everything was fine, until I decided to reboot. When it came back, nothing was working, because I forgot to properly configure a way for Talos to read the ZFS volume encryption key.

Part 2: at some point I configured Audiobookshelf to store data on top of said ZFS volume. Everything worked fine for a few weeks, until I had to reboot the server again (for reasons). When it came back, I had lost all my downloaded podcasts, because a typo in my configuration pointed to a directory outside of the PVC, so it mounted as an emptyDir volume.

Honestly, I should have known better. I had issues in the past when some servers went down because of power failures (battery didn't last) and they did not come back properly.

You gotta do a reboot/power test every once in a while, just like you have to test your backups on a regular basis.
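
The Part 2 failure (a data path that sits outside the persistent mount) is the kind of mistake that can sometimes be caught before a reboot. Here is a minimal sketch in Python, assuming you can list the container's mount paths and the directories the app is configured to write to. The function name and all paths below are hypothetical, not taken from my actual setup, and it needs Python 3.9+ for `is_relative_to`.

```python
from pathlib import PurePosixPath


def paths_outside_mounts(data_paths, mount_paths):
    """Return the data paths that are not inside any of the mount paths."""
    mounts = [PurePosixPath(m) for m in mount_paths]
    return [
        p
        for p in data_paths
        # is_relative_to compares whole path components, so
        # "/podcast/x" does not count as being inside "/podcasts".
        if not any(PurePosixPath(p).is_relative_to(m) for m in mounts)
    ]


# Hypothetical example: the PVC is mounted at /podcasts and /config,
# but one app setting has a typo ("/podcast" instead of "/podcasts").
mounts = ["/podcasts", "/config"]
data_dirs = ["/podcasts", "/config/db", "/podcast/downloads"]
print(paths_outside_mounts(data_dirs, mounts))
```

Anything this prints is a directory that would probably land on ephemeral storage. It only compares strings, so it is not a replacement for the actual reboot test.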

#HomeLab #TalosLinux #ZFS #SRE #DevOps @homelab


In early September, The Matrix Foundation homeserver went down.

I'm extremely proud of our SRE team. They had a Disaster Recovery Plan and monthly exercises to apply it, resulting in no data loss despite a 24h outage.

I learned a lot about how to properly back up and restore a Postgres database while writing this post with the SREs. We also learned how to better prevent human error and be more resilient to it.

Thanks all for the hugops during the outage!

matrix.org/blog/2025/10/post-m…

#homelab #selfHosting #sre


Do you like security? Do you like privacy? Cryptography? Do you like working for a public benefit non-profit instead of an investor-beholden corporation?

Let's Encrypt is hiring someone to join our SRE team and help run the largest Certificate Authority in the world! Come work with me and some of the most wonderful folks in tech, to make the web a better place.

abetterinternet.org/careers/le…

#jobs #sre #webPKI #security #privacy #cryptography
