#FreeBSD recommendations for #monitoring #alerting #observability sought. I have a much loved collectd + riemann that needs an upgrade.

Target is about 10 servers and 200 jails.

No apache2/php, nagios or clones thereof please. I don’t have these in my stack today, and my expertise in managing them is about 20 years out of date. I prefer to avoid JVM stuff but I’m not violently against it.

Doesn’t have to be in ports yet (like the sensu.io server) if it’s in a friendly language.

in reply to feld

@feld @maikel is a huge fan of sensu. We don’t have the server in FreeBSD ports, but I poked at it a bit about 18 months ago and porting it seems feasible.

I also tried openobserve. Again, a port seems doable, but I get the feeling they are still experimenting with licensing and business models. I really want a simple store of logs that doesn’t have elastic/opensearch/mongo at the core.

in reply to feld

@feld @maikel I recently started using #VictoriaLogs and have been impressed, at least compared to my previous setup of a central `syslogd` and `grep`. I needed `syslogd_flags` containing `-O rfc5424` on BSD systems to stop it mixing up host and app names. A colleague installed Prometheus/Grafana some years back, and while it produced pretty and entertaining graphs, none of it seemed actually useful, and I've never seen decent use-case examples for such metrics.
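
On the `-O rfc5424` bit: as far as I understand it, the point is that an RFC 5424 header puts hostname and app name in separate, fixed positions, so a receiver doesn't have to guess which is which. A rough Python sketch of that layout follows. The helper names and the `jail01`/`nginx` values are made up for illustration, and this is not how syslogd or VictoriaLogs work internally.

```python
from datetime import datetime, timezone


def rfc5424_line(hostname, app_name, message, facility=1, severity=6,
                 procid="-", msgid="-", when=None):
    """Format one syslog line using the RFC 5424 header layout."""
    when = when or datetime.now(timezone.utc)
    # PRI is facility * 8 + severity (1 = user, 6 = informational).
    pri = facility * 8 + severity
    timestamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # "1" is the protocol version; the lone "-" before the message
    # marks empty structured data.
    return (f"<{pri}>1 {timestamp} {hostname} {app_name} "
            f"{procid} {msgid} - {message}")


def parse_header(line):
    """Pull hostname and app name out by position (illustrative only)."""
    _pri_version, _timestamp, hostname, app_name, _procid, _msgid, _rest = \
        line.split(" ", 6)
    return {"hostname": hostname, "app_name": app_name}


example = rfc5424_line(
    "jail01", "nginx", "started",
    when=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
```

Because the fields are positional, `parse_header(example)` gets the host and the app back without any heuristics.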

in reply to okapi

It's not just about metrics themselves. You can just centralize all your stuff here. Most people use multiple systems: one for metrics (cpu % perhaps), and another for monitoring processes etc (is nginx running)

But you can just leverage a time series database to do both. For example, you can use the systemd exporter to get the services into the time series database, then set up alert rules for those. A process is running if its series has a value of 1, and stopped or absent if the value is 0 or null.
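
To make the 1 / 0 / null idea concrete, here's a minimal Python sketch of the check an alert rule would be doing. The function names and the sample service names are hypothetical, and splitting "0" into stopped and "null" into absent is my own assumption; a real setup would express this as an alert rule in the monitoring system rather than in Python.

```python
import math
from typing import Dict, List, Optional


def service_state(value: Optional[float]) -> str:
    """Map the latest sample for a service to a state.

    1 means running, 0 means stopped, and a missing sample
    (None or NaN) is treated as absent.
    """
    if value is None or math.isnan(value):
        return "absent"
    return "running" if value == 1 else "stopped"


def services_to_alert(latest: Dict[str, Optional[float]]) -> List[str]:
    """Return the services that are not running, sorted by name."""
    return sorted(
        name for name, value in latest.items()
        if service_state(value) != "running"
    )


# Hypothetical latest samples, keyed by service name.
latest_samples = {"nginx": 1, "sshd": 0, "postgres": None}
needs_attention = services_to_alert(latest_samples)
```

The useful part is that "absent" is caught too, so a service that stops reporting entirely still trips the alert.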

You get a lot of power if you apply a little creativity

in reply to Psyhackological

well, VictoriaMetrics supports:

- prometheus
- influxdb
- opentsdb
- graphite
- (maybe datadog? can't remember)
- their own API
- all of PromQL, plus some improvements to it that allow you to do certain queries that are literally impossible in normal PromQL

The client/agent side is a 7.5MB binary right now vs 100MB+ for Prometheus

It's faster, uses fewer resources, and has a far more efficient storage implementation (like 10:1 compression)

and it's basically a complete drop-in replacement. It supports almost all of the `prometheus.yml` config file. There are a few things you may have to remove, but all your target configuration etc will still work as-is so you don't have to re-engineer anything

It's good. Someone sat down and said "we can do this better with less bloat" and they did it. And I hope they become filthy rich for it.

in reply to feld

@feld so to sum up, it feels like they saw Prometheus, did it themselves, and now it looks like a worse version of it.

At my job I use a Prometheus, AlertManager and Grafana stack, and I'm wondering if this can be done better. Also, if I understand correctly, does it come with more easily accessible logs? Sorry, I'm trying to grasp it, and I see you have experience with Victoria things.

@feld
in reply to Psyhackological

> looks like a worse version of it.

I'm not sure what you mean by this. Are you referring to my screenshot? That's just the basic web interface to VictoriaLogs. There's also one for VictoriaMetrics. But I use Grafana with Victoria. You just add it as a Prometheus data source and it works.
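
The reason the Prometheus data source works is that VictoriaMetrics answers the same HTTP query API as Prometheus (`/api/v1/query`). A small Python sketch of what that looks like from a client follows. The base URL `http://localhost:8428` assumes a default single-node install, the function names are made up, and the sample response is hand-written in the Prometheus instant-query shape rather than captured from a real server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for a Prometheus-compatible API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: str) -> dict:
    """Turn an instant-query response into {instance: value}."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("query failed")
    return {
        sample["metric"].get("instance", ""): float(sample["value"][1])
        for sample in payload["data"]["result"]
    }


def run_query(base_url: str, promql: str) -> dict:
    """Fetch and parse a query; needs a reachable server."""
    with urlopen(build_query_url(base_url, promql), timeout=10) as resp:
        return parse_instant_vector(resp.read().decode("utf-8"))


# Hand-written sample in the Prometheus response format (not real data).
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "instance": "jail01:9100"},
             "value": [1700000000, "1"]},
        ],
    },
})
parsed = parse_instant_vector(sample_body)
```

Against a live server you'd call something like `run_query("http://localhost:8428", "up")`, and the same code should work pointed at a Prometheus instance instead.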

I have never used AlertManager, just the alerts+webhooks directly in Grafana. But Victoria does have their own AlertManager equivalent.

Also, I have hated every implementation of "Logs" in Grafana. Loki is terrible and the Grafana support is bad IMHO. But I *think* you can also use VictoriaLogs with all of those things in case you want alerts based on certain log events