Looking for a way to monitor my services

Shimitar@feddit.it · 7 months ago

Looking for a way to monitor my services

Avid Amoeba@lemmy.ca · edit-2 7 months ago

I think Prometheus is a good industry standard. It can do everything you listed except for restarting stuff. It’s got a decent built-in monitoring capability and you can extend it trivially to monitor anything. For example I wrote a 5-liner to monitor ZFS health and another for LVM. I even monitor my routers with it. OpenWrt has an installable node exporter for Prometheus.
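For anyone wondering what such a small check might look like, here is a minimal sketch in Python, not the actual script mentioned above. It assumes node exporter's textfile collector is enabled and pointed at a directory of `*.prom` files, and that `zpool status -x` prints "all pools are healthy" when nothing is wrong. The metric name `zfs_pools_healthy` and the output path you pass in are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Message printed by `zpool status -x` when every pool is fine.
HEALTHY_MESSAGE = "all pools are healthy"


def render_metric(zpool_output):
    """Turn `zpool status -x` output into Prometheus text exposition format."""
    healthy = 1 if zpool_output.strip() == HEALTHY_MESSAGE else 0
    return (
        "# HELP zfs_pools_healthy 1 if zpool status -x reports all pools healthy.\n"
        "# TYPE zfs_pools_healthy gauge\n"
        f"zfs_pools_healthy {healthy}\n"  # hypothetical metric name
    )


def write_atomically(path, text):
    """Write to a temp file, then rename, so a scrape never sees a partial file."""
    directory = os.path.dirname(path) or "."
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        tmp.write(text)
    os.chmod(tmp.name, 0o644)  # temp files default to 0600; the exporter must be able to read it
    os.replace(tmp.name, path)


# Only runs when an output path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    result = subprocess.run(["zpool", "status", "-x"], capture_output=True, text=True)
    write_atomically(sys.argv[1], render_metric(result.stdout))
```

Run it from cron or a systemd timer with the `.prom` file path as its only argument, and alert in Prometheus when the gauge is 0.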

Service restarting is a remote execution capability and generally falls outside of the monitoring domain. You’d be better off implementing that with another process/service manager. If you’re running systemd, that’s one of its primary purposes. You can use it to start/stop/restart containers just like normal processes.

krash@lemmy.ml · 7 months ago

Can you share a guide / tutorial on how to accomplish what OP wants (or just get started with Prometheus)? I was in the same boat as OP and settled for netdata, and eventually gave up on monitoring altogether because it was either overwhelming me with data, too cumbersome to set up or had features behind paid plans.

Avid Amoeba@lemmy.ca · edit-2 7 months ago

Anytime you’re asking this, go for the project’s Quick Start / Getting Started doc. In this case here. If you’re on a Debian-based system, Prometheus is already packaged in the repository, so you don’t have to download the latest release. You likely won’t gain anything from the latest release except the pain of having to set up the bare binary as a service with systemd. I followed that doc to set up mine but installed it from apt.

On second thought, if you’re getting it from the repo and it already has a systemd unit defined, it might be more difficult to follow the Getting Started doc. You know what, follow it as-is. Once you have something running and monitoring ad hoc, it’ll be easy to install from apt and put your config in it.

zinderic@programming.dev · 7 months ago

Give https://github.com/louislam/uptime-kuma a try. I’m planning to do the same for a similar use case. Sensu (sensu.io) is a more sophisticated option, but it requires more infrastructure and there is a bit of a learning curve with it.

cron@feddit.de · 7 months ago

While I really like Uptime Kuma, it seems a bit too restricted for OP’s use case. For example, to monitor disk or CPU usage, you would need to write your own scripts. It would be doable, but not very nice.
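To give an idea of the kind of script that would mean, here is a rough Python sketch using only the standard library. The thresholds and function names are arbitrary choices for illustration, and how the result gets reported to Uptime Kuma is left out entirely.

```python
import os
import shutil


def disk_used_percent(path="/"):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def load_per_cpu():
    """One-minute load average divided by the number of CPUs (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return one_minute / (os.cpu_count() or 1)


def evaluate(disk_percent, load, disk_limit=90.0, load_limit=1.5):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if disk_percent > disk_limit:
        problems.append(f"disk usage {disk_percent:.1f}% exceeds {disk_limit:.1f}%")
    if load > load_limit:
        problems.append(f"load per CPU {load:.2f} exceeds {load_limit:.2f}")
    return problems


def main():
    """Print the check result and return a shell-style exit code."""
    problems = evaluate(disk_used_percent("/"), load_per_cpu())
    print("OK" if not problems else "; ".join(problems))
    return 1 if problems else 0
```

Calling `main()` from a cron job gives a simple pass/fail signal, but you still have to maintain the thresholds and the reporting yourself, which is the "not very nice" part.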

At least as I understood the question, OP is probably looking for something like Icinga.

zinderic@programming.dev · 7 months ago

Yeah, a better fit, but a bit of trouble to set up… What’s your opinion on Icinga? Never used it myself.

cron@feddit.de · 7 months ago

We had it at work, but I never did anything other than receive and resolve alerts. It looked good to me, though, and I liked the system.

poVoq@slrpnk.net · 7 months ago

Have you tried Cockpit? It has pretty nice Podman integration.

umbrella@lemmy.ml · edit-2 7 months ago

Grafana is pretty annoying to learn and set up, but it does everything you seem to want.

CyberTailor@lemmy.world · 7 months ago

good ol’ nagios (or one of its forks)

7 months ago

You can do most if not all of this with CheckMk, but it’s probably overkill.

Majestix@lemmy.world · 7 months ago

I use Homarr and enjoy it a lot. Nice interface, can monitor not only services but also the server itself, and is quite customizable.

rambos@lemm.ee · 7 months ago

If you go with the dashboard approach, I would suggest Homepage.

Majestix@lemmy.world · 7 months ago

I like Homarr because you can edit the page with drag and drop. I think with Homepage I would have to edit the config file.

LifeBandit666@feddit.uk · edit-2 7 months ago

I tried to spin up a Homarr docker container the other day after seeing it on YouTube, but because it’s located in ghcr it just wouldn’t install.

I even added ghcr to my resources in docker using my password and an API key, but still no dice.

I’m missing something obvious, but I’m not sure what, any pointers?

Edit: I’ve just tried again and this time it hasn’t failed with an error message, just hanging in Portainer stacks deployment instead

Edit 2: I left it hanging and checked while I was out and about (love Tailscale), and it’s working now!

TCB13@lemmy.world · 7 months ago

deleted by creator

atzanteol@sh.itjust.works · 7 months ago

i have a mixed set of containers (a few, not too many) and bare-metal services

Containers run on bare metal. Or are you running them in a vm?

Shimitar@feddit.it · 7 months ago

I run containers on bare metal indeed.

I have services running in containers on bare metal and services running without containers, on bare metal.