new article: monitoring

might need some touch ups
added word count to article details
changed aboutme
2005 2024-05-18 21:28:25 +02:00
parent 5a426ff5dc
commit 6aab90002a
9 changed files with 493 additions and 9 deletions

@ -0,0 +1 @@
<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="icon icon-tabler icons-tabler-outline icon-tabler-letter-case-lower"><path stroke="none" d="M0 0h24v24H0z" fill="none"/><path d="M6.5 15.5m-3.5 0a3.5 3.5 0 1 0 7 0a3.5 3.5 0 1 0 -7 0" /><path d="M10 12v7" /><path d="M17.5 15.5m-3.5 0a3.5 3.5 0 1 0 7 0a3.5 3.5 0 1 0 -7 0" /><path d="M21 12v7" /></svg>

@ -8,4 +8,3 @@ theme = "stack"
# Available values: en, fr, id, ja, ko, pt-br, zh-cn, zh-tw, es, de, nl, it, th, el, uk, ar
defaultContentLanguage = "en"
hasCJKLanguage = false

@ -12,8 +12,8 @@ lastUpdated = "Jan 02, 2006 15:04 MST"
[sidebar]
subtitle = "Software developer of somekind"
musicTitle = "Something Real - Post Malone"
musicUrl = "https://www.youtube.com/watch?v=-J-x8UXqXBg&t=1"
musicTitle = "Nate Growing Up - Labrinth"
musicUrl = "https://www.youtube.com/watch?v=mgDMTorgCPg"
[sidebar.avatar]
@ -22,9 +22,10 @@ local = true
src = "img/pfp.png"
[article]
math = false
math = true
readingTime = true
[article.license]
enabled = true
default = "Licensed under CC-BY-SA 4.0"

@ -10,7 +10,7 @@ menu:
icon: user
---
# 4o1x5 (2005)
# About me
Hi there. I see you somehow stumbled across my site.
@ -24,12 +24,32 @@ I do and like all kinds of things but out of many the following is worth mention
- I learn C# and PHP in school which I almost never use.
I mainly develop in Rust as I find it fitting for every purpose out there.
But I also know Java, C#, Vue, Python and some PHP
- Privacy advocate
- Privacy & open-source advocate
I created this site as I had too much processing power lying around. I decided to start blogging as a way to practice writing documentation and help the helpless out there (nixos wiki is hell).
My posts are aimed at newbies as I do not have much in-depth knowledge of what I'm writing about therefore I cannot go into much detail. Hopefully this will change in the future.
I created this site to share some of my guides with people.
I also plan on making videos in the future but finals make it impossible as of now.
**Ps: I am looking for people with the same interests. [Feel free to dm me on matrix](https://matrix.to/#/@4o1x5:4o1x5.dev)**
# Why 4o1x5?
I have no idea why I chose this nickname, I guess I wanted something unique.
It's my birth year. You can just call me 2005.
# Internet contribution
I try my best to contribute as much as I can to the public, as I feel that sharing my knowledge with others is good for my skills even if I get no reward in the end.
I've got a few projects cooking right now, tho I have no idea if they'll ever be done.
## Gaming
I play a lot of games in my free time, mostly survival and adventure games. Some PvE too.
You might [catch me streaming](https://live.4o1x5.dev) on my site. You are welcome to watch, even if I offer nothing interesting to watch.
Some of the games I play: Warframe, Minecraft, CS 1.6, Overwatch, Orcs Must Die 3 and so on...
# Future
I plan on working in IT, more specifically backend development.
But lately I have also been thinking of becoming a System Administrator tho I have no idea how that would turn out.
The main problem I have right now is that I don't develop in languages/frameworks that are popular in the industry. Take Rust, for example: I saw a few job listings for it, all requiring at least 5 years of experience (I have none). To be honest, [I'm open to job inquiries at all times](https://matrix.to/#/@4o1x5:4o1x5.dev), even alongside school.
Also here is a picture of my cat
<img src="kitty.jpg" alt="tiger" width="400" />

Binary file not shown (126 KiB).
Binary file not shown (4 MiB).
Binary file not shown (80 KiB).

@ -0,0 +1,394 @@
---
title: Monitor instances
description: Export metrics, collect them, visualize them.
date: 2024-05-17 07:00:00+0000
image: chris-yang-1tnS_BVy9Jk-unsplash.jpg
categories:
- Nix
- Guide
- Sysadmin
- Monitoring
tags:
- Nix
- Nginx
- Prometheus
- Exporters
- Monitoring
- Docker compose
draft: false
---
# Monitoring
Monitoring your instances allows you to keep track of your servers' load and health over time. Even looking at the stats once a day can make a huge difference, as it lets you catch problems before they turn into disasters.
I have been monitoring my servers with this method for years, and there were many cases where I was grateful for having set it all up.
In this small article I have included two guides to set these services up. The first uses [NixOs](#nixos); the second uses [docker-compose](#docker-compose), but that one is very sparse as the main focus of this article is NixOS.
![Made with Excalidraw](graph1.png)
**Prometheus**
Prometheus is an open-source monitoring system. It helps to track, collect, and analyze
metrics from various applications and infrastructure components. It collects metrics from other software called _exporters_, which serve an HTTP endpoint that returns data in the Prometheus text format.
Here is an example from `node-exporter`:
```
# curl http://localhost:9100/metrics
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 2.54196405e+06
node_cpu_seconds_total{cpu="0",mode="iowait"} 4213.44
node_cpu_seconds_total{cpu="0",mode="irq"} 0
node_cpu_seconds_total{cpu="0",mode="nice"} 0.06
node_cpu_seconds_total{cpu="0",mode="softirq"} 743.4
...
```
**Grafana**
Grafana is an open-source data visualization and monitoring platform. It has hundreds of built-in features that help you query data sources like Prometheus, InfluxDB, MySQL and so on...
For today's guide I will show you how to set up a few exporters (node-exporter, smartctl) and collect their metrics with Prometheus, then visualize that data with Grafana.
## NixOs
Nix makes it trivial to set up these services, as there are already predefined options for them in nixpkgs. I will give you example configuration files below that you can just copy and paste.
I have a guide on [remote deployment](/p/remote-deployments-on-nixos/) for NixOs. Below is an example folder structure you can use to deploy the services.
{{< filetree/container >}}
{{< filetree/folder name="server1" state="closed" >}}
{{< filetree/folder name="services" state="closed" >}}
{{< filetree/file name="some-service.nix" >}}
{{< filetree/folder name="monitoring" state="closed" >}}
{{< filetree/file name="prometheus.nix" >}}
{{< filetree/file name="grafana.nix" >}}
{{< filetree/folder name="exporters" state="closed" >}}
{{< filetree/file name="node.nix" >}}
{{< filetree/file name="smartctl.nix" >}}
{{< /filetree/folder >}}
{{< /filetree/folder >}}
{{< /filetree/folder >}}
{{< filetree/file name="configuration.nix" >}}
{{< filetree/file name="flake.nix" >}}
{{< filetree/file name="flake.lock" >}}
{{< /filetree/folder >}}
{{< /filetree/container >}}
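To wire the files from the tree above into the system, `configuration.nix` has to import them. Here is a minimal sketch, assuming the paths match the folder layout shown (adjust them if your structure differs):

```nix
# configuration.nix (sketch; import paths assume the folder layout above)
{ ... }: {
  imports = [
    ./services/monitoring/prometheus.nix
    ./services/monitoring/grafana.nix
    ./services/monitoring/exporters/node.nix
    ./services/monitoring/exporters/smartctl.nix
  ];
}
```

Each imported file is a regular NixOS module, so you can drop any of them from the list on servers that only need the exporters.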
### Exporters
First is node-exporter. It exports all kinds of system metrics, from CPU usage and load average to systemd service counts.
#### Node-exporter
```nix
# /services/monitoring/exporters/node.nix
{ pkgs, ... }: {
  services.prometheus.exporters.node = {
    enable = true;
    #port = 9001; #default is 9100
    enabledCollectors = [ "systemd" ];
  };
}
```
#### Smartctl
Smartctl is a tool included in the smartmontools package, a collection of monitoring tools for hard drives and SSDs.
This exporter lets you check up on the health of your drive(s). The `smartd` service configured alongside it will also send you a wall notification if one of your drives develops bad sectors, which usually suggests it is dying.
```nix
# /services/monitoring/exporters/smartctl.nix
{ pkgs, ... }: {
  # exporter
  services.prometheus.exporters.smartctl = {
    enable = true;
    devices = [ "/dev/sda" ];
  };
  # for wall notifications
  services.smartd = {
    enable = true;
    notifications.wall.enable = true;
    devices = [
      {
        device = "/dev/sda";
      }
    ];
  };
}
```
If you happen to have other drives you can use `lsblk` to check their paths:
```bash
nix-shell -p util-linux --command lsblk
```
For example, here are my PC's drives:
```
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 1 0B 0 disk
nvme1n1 259:0 0 476,9G 0 disk
├─nvme1n1p1 259:1 0 512M 0 part /boot
├─nvme1n1p2 259:2 0 467,6G 0 part
│ └─luks-bbb8e429-bee1-4b5e-8ce8-c54f5f4f29a2
│ 254:0 0 467,6G 0 crypt /nix/store
│ /
└─nvme1n1p3 259:3 0 8,8G 0 part
└─luks-f7e86dde-55a5-4306-a7c2-cf2d93c9ee0b
254:1 0 8,8G 0 crypt [SWAP]
nvme0n1 259:4 0 931,5G 0 disk /mnt/data
```
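With more than one drive you can list them once and feed the same list to both the exporter and `smartd`. This is only a sketch: the `drives` binding is a name I made up for illustration, and the two NVMe paths are taken from the `lsblk` output above, so swap in whatever your machine shows.

```nix
# /services/monitoring/exporters/smartctl.nix (multi-drive sketch)
{ ... }:
let
  # Hypothetical list; replace with the disks lsblk reports on your machine
  drives = [ "/dev/nvme0n1" "/dev/nvme1n1" ];
in
{
  services.prometheus.exporters.smartctl = {
    enable = true;
    devices = drives;
  };
  services.smartd = {
    enable = true;
    notifications.wall.enable = true;
    # smartd expects a list of attribute sets, one per device
    devices = map (d: { device = d; }) drives;
  };
}
```

Keeping the list in one place means a newly added disk cannot end up exported but not watched by `smartd`, or the other way around.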
### Prometheus
Now that we have set up these two exporters, we need to somehow collect their metrics.
Here is a config file for Prometheus, with the scrape configs already written.
```nix
# /services/monitoring/prometheus.nix
{ pkgs, config, ... }: {
  services.prometheus = {
    enable = true;
    scrapeConfigs = [
      {
        job_name = "node";
        scrape_interval = "5s";
        static_configs = [
          {
            targets = [ "localhost:${toString config.services.prometheus.exporters.node.port}" ];
            labels = { alias = "node.server1.local"; };
          }
        ];
      }
      {
        job_name = "smartctl";
        scrape_interval = "5s";
        static_configs = [
          {
            targets = [ "localhost:${toString config.services.prometheus.exporters.smartctl.port}" ];
            labels = { alias = "smartctl.server1.local"; };
          }
        ];
      }
    ];
  };
}
```
I recommend raising the 5s scrape interval, as you can imagine it can generate a lot of data.
A node-exporter scrape returns ~16 kB of text on average. One day has 86400 seconds; divided by 5, that's 17280 scrapes a day.
17280 \* 16 = 276480 kB. That's about 270 megabytes of scraped text a day, and with multiple servers it is X times as much.
30 days of scraping is about 8 gigabytes (1x). Prometheus compresses samples heavily, so the space actually used on disk is much smaller than this raw figure, but it still grows with every target and every scrape. **Also remember that by default Prometheus only stores data for 15 days!**
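Both knobs can be set from Nix. The values below are just examples I picked, not recommendations from any official source. Note that a `scrape_interval` set inside a job (like the `"5s"` in the config above) overrides the global one, so remove or change those per-job lines if you want the global value to apply.

```nix
# /services/monitoring/prometheus.nix (excerpt, sketch with example values)
{ ... }: {
  services.prometheus = {
    enable = true;
    # Used by every job that does not set its own scrape_interval
    globalConfig.scrape_interval = "30s";
    # Keep samples for 30 days, set explicitly instead of relying on the default
    retentionTime = "30d";
  };
}
```

Going from 5s to 30s cuts the number of scrapes per day from 17280 to 2880, six times less data for the same retention period.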
### Grafana
Now let's get onto gettin' a sexy dashboard like this. First we gotta set up Grafana.
![Node exporter full (id 1860)](20240518_1958.png)
```nix
# /services/monitoring/grafana.nix
{ pkgs, config, ... }:
let
  grafanaPort = 3000;
in
{
  services.grafana = {
    enable = true;
    settings.server = {
      http_port = grafanaPort;
      http_addr = "0.0.0.0";
    };
    provision = {
      enable = true;
      datasources.settings.datasources = [
        {
          name = "prometheus";
          type = "prometheus";
          url = "http://127.0.0.1:${toString config.services.prometheus.port}";
          isDefault = true;
        }
      ];
    };
  };
  # Grafana is served over HTTP, so only the TCP port needs to be open
  networking.firewall = {
    allowedTCPPorts = [ grafanaPort ];
  };
}
```
If you want to access it via the internet, change the following:
- `http_addr = "127.0.0.1"`
- remove the firewall allowed ports
This ensures data will only flow via the nginx reverse proxy.
Remember to set `networking.domain = "example.com"` to your domain.
```nix
# /services/nginx.nix
{ pkgs, config, ... }:
let
  url = "http://127.0.0.1:${toString config.services.grafana.settings.server.http_port}";
in {
  services.nginx = {
    enable = true;
    virtualHosts = {
      "grafana.${config.networking.domain}" = {
        # Auto cert by let's encrypt
        forceSSL = true;
        enableACME = true;
        locations."/" = {
          proxyPass = url;
          extraConfig = "proxy_set_header Host $host;";
        };
        # WebSocket upgrade headers for Grafana's live API
        locations."/api" = {
          extraConfig = ''
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
            proxy_set_header Host $host;
          '';
          proxyPass = url;
        };
      };
    };
  };
  # open ports 80 and 443 for nginx (plain HTTP/HTTPS only needs TCP)
  networking.firewall = {
    enable = true;
    allowedTCPPorts = [
      443
      80
    ];
  };
}
```
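For reference, the Grafana side of the reverse-proxy setup could look roughly like this. It is a sketch under a few assumptions: `example.com` is a placeholder domain, and `domain` / `root_url` are optional extras I added so Grafana generates links with the public address. It replaces the `settings.server` block and the firewall entry from the earlier `grafana.nix`.

```nix
# /services/monitoring/grafana.nix (reverse-proxy variant, sketch)
{ config, ... }: {
  # Placeholder, set this to your own domain
  networking.domain = "example.com";

  services.grafana.settings.server = {
    http_port = 3000;
    # Listen on loopback only; nginx is the only way in
    http_addr = "127.0.0.1";
    # Optional: make Grafana aware of its public address
    domain = "grafana.${config.networking.domain}";
    root_url = "https://grafana.${config.networking.domain}/";
  };

  # No networking.firewall entry for port 3000 here on purpose
}
```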
### Log in
The default user is `admin` and the password is `admin`. Grafana will ask you to change it upon logging in!
### Add the dashboards
For node-exporter you can go to Dashboards --> New --> Import and paste in `1860`.
Now you can see all the metrics of your server(s).
Smartctl
## Docker-compose
{{< filetree/container >}}
{{< filetree/folder name="monitoring-project" state="closed" >}}
{{< filetree/file name="docker-compose.yml" >}}
{{< filetree/file name="prometheus.yml" >}}
{{< /filetree/folder >}}
{{< /filetree/container >}}
### Compose project
I did not include a reverse proxy or smartctl, as I forgot how to actually do it, that's how long I've been using nix :/
```yaml
# docker-compose.yml
version: "3.8"
networks:
  monitoring:
    driver: bridge
volumes:
  prometheus_data: {}
services:
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    hostname: node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - "--path.procfs=/host/proc"
      - "--path.rootfs=/rootfs"
      - "--path.sysfs=/host/sys"
      - "--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)"
    networks:
      - monitoring
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    hostname: prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--web.console.libraries=/etc/prometheus/console_libraries"
      - "--web.console.templates=/etc/prometheus/consoles"
      - "--web.enable-lifecycle"
    networks:
      - monitoring
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    networks:
      - monitoring
    restart: unless-stopped
    ports:
      - '3000:3000'
```
```yaml
# ./prometheus.yml
global:
  scrape_interval: 5s
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["node-exporter:9100"]
```
```bash
docker compose up -d
```
### Setup prometheus as data source inside grafana
Head to Connections --> Data sources --> Add new data source --> Prometheus
Type in `http://prometheus:9090` as the URL, then click `Save & test` at the bottom.
Now you can add the dashboards, as [explained in this section](#add-the-dashboards).
Photo by <a href="https://unsplash.com/@chrisyangchrisfilm?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Chris Yang</a> on <a href="https://unsplash.com/photos/silhouette-photography-of-man-1tnS_BVy9Jk?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Unsplash</a>

@ -0,0 +1,69 @@
<div class="article-details">
{{ if .Params.categories }}
<header class="article-category">
{{ range (.GetTerms "categories") }}
<a href="{{ .RelPermalink }}" {{ with .Params.style }}style="background-color: {{ .background }}; color: {{ .color }};"{{ end }}>
{{ .LinkTitle }}
</a>
{{ end }}
</header>
{{ end }}
<div class="article-title-wrapper">
<h2 class="article-title">
<a href="{{ .RelPermalink }}">
{{- .Title -}}
</a>
</h2>
{{ with .Params.description }}
<h3 class="article-subtitle">
{{ . }}
</h3>
{{ end }}
</div>
{{ $showReadingTime := .Params.readingTime | default (.Site.Params.article.readingTime) }}
{{ $showDate := not .Date.IsZero }}
{{ $showFooter := or $showDate $showReadingTime }}
{{ if $showFooter }}
<footer class="article-time">
{{ if $showDate }}
<div>
{{ partial "helper/icon" "date" }}
<time class="article-time--published">
{{- .Date.Format (or .Site.Params.dateFormat.published "Jan 02, 2006") -}}
</time>
</div>
{{ end }}
{{ if $showReadingTime }}
<div>
{{ partial "helper/icon" "clock" }}
<time class="article-time--reading">
{{ T "article.readingTime" .ReadingTime }}
</time>
</div>
{{ end }}
<div>
{{ partial "helper/icon" "letter-case-lower" }}
<time class="article-time--reading">
{{ .WordCount }} words
</time>
</div>
</footer>
{{ end }}
{{ if .IsTranslated }}
<footer class="article-translations">
{{ partial "helper/icon" "language" }}
<div>
{{ range .Translations }}
<a href="{{ .Permalink }}" class="link">{{ .Language.LanguageName }}</a>
{{ end }}
</div>
</footer>
{{ end }}
</div>