new article: monitoring
might need some touch ups added word count to article details changed aboutme
This commit is contained in:
parent
5a426ff5dc
commit
6aab90002a
1
assets/icons/letter-case-lower.svg
Normal file
|
@ -0,0 +1 @@
|
|||
<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="icon icon-tabler icons-tabler-outline icon-tabler-letter-case-lower"><path stroke="none" d="M0 0h24v24H0z" fill="none"/><path d="M6.5 15.5m-3.5 0a3.5 3.5 0 1 0 7 0a3.5 3.5 0 1 0 -7 0" /><path d="M10 12v7" /><path d="M17.5 15.5m-3.5 0a3.5 3.5 0 1 0 7 0a3.5 3.5 0 1 0 -7 0" /><path d="M21 12v7" /></svg>
|
After Width: | Height: | Size: 499 B |
|
@ -8,4 +8,3 @@ theme = "stack"
|
|||
# Available values: en, fr, id, ja, ko, pt-br, zh-cn, zh-tw, es, de, nl, it, th, el, uk, ar
|
||||
defaultContentLanguage = "en"
|
||||
hasCJKLanguage = false
|
||||
|
||||
|
|
|
@ -12,8 +12,8 @@ lastUpdated = "Jan 02, 2006 15:04 MST"
|
|||
|
||||
[sidebar]
|
||||
subtitle = "Software developer of somekind"
|
||||
musicTitle = "Something Real - Post Malone"
|
||||
musicUrl = "https://www.youtube.com/watch?v=-J-x8UXqXBg&t=1"
|
||||
musicTitle = "Nate Growing Up - Labrinth"
|
||||
musicUrl = "https://www.youtube.com/watch?v=mgDMTorgCPg"
|
||||
|
||||
|
||||
[sidebar.avatar]
|
||||
|
@ -22,9 +22,10 @@ local = true
|
|||
src = "img/pfp.png"
|
||||
|
||||
[article]
|
||||
math = false
|
||||
math = true
|
||||
readingTime = true
|
||||
|
||||
|
||||
[article.license]
|
||||
enabled = true
|
||||
default = "Licensed under CC-BY-SA 4.0"
|
||||
|
|
|
@ -10,7 +10,7 @@ menu:
|
|||
icon: user
|
||||
---
|
||||
|
||||
# 4o1x5 (2005)
|
||||
# About me
|
||||
|
||||
Hi there. I see you somehow stumbled across my site.
|
||||
|
||||
|
@ -24,12 +24,32 @@ I do and like all kinds of things but out of many the following is worth mention
|
|||
- I learn C# and PHP in school which I almost never use.
|
||||
I mainly develop in Rust as I find it fitting for every purpose out there.
|
||||
But I also know Java, C#, Vue, Python and some PHP
|
||||
- Privacy advocate
|
||||
- Privacy & open-source advocate
|
||||
|
||||
I created this site as I had too much processing power lying around. I decided to start blogging as a way to practice writing documentation and help the helpless out there (nixos wiki is hell).
|
||||
My posts are aimed at newbies as I do not have much in-depth knowledge of what I'm writing about therefore I cannot go into much detail. Hopefully this will change in the future.
|
||||
I created this site to share some of my guides with people.
|
||||
I also plan on making videos in the future but finals make it impossible as of now.
|
||||
|
||||
**Ps: I am looking for people with the same interests. [Feel free to dm me on matrix](https://matrix.to/#/@4o1x5:4o1x5.dev)**
|
||||
# Why 4o1x5?
|
||||
|
||||
I have no idea why I chose this nickname, I guess I wanted something unique.
|
||||
It's my birth year. You can just call me 2005.
|
||||
|
||||
# Internet contribution
|
||||
|
||||
I try my best to contribute as much as I can to the public, as I feel like sharing my knowledge with others is good for my skills even if I get no reward in the end.
|
||||
I got a few projects cooking right now, tho I have no idea if they'll ever be done.
|
||||
|
||||
## Gaming
|
||||
|
||||
I play a lot of games in my free time, mostly survival, adventure games. Some PvE too.
|
||||
You might [catch me streaming](https://live.4o1x5.dev) on my site. You are welcome to watch, maybe even if I offer nothing interesting to watch.
|
||||
Some of these include: Warframe, Minecraft, CS 1.6, Overwatch, Orcs must die 3 and so on...
|
||||
|
||||
# Future
|
||||
|
||||
I plan on working in IT, more specifically backend development.
|
||||
But lately I have also been thinking of becoming a System Administrator tho I have no idea how that would turn out.
|
||||
The main problem I have right now is that I don't develop in languages/frameworks that are popular in the industry. Take Rust for example: I saw a few job listings for it, all requiring at least 5 years of experience (I have none). [I'm open to job inquiries at any time](https://matrix.to/#/@4o1x5:4o1x5.dev) to be honest, even alongside school.
|
||||
|
||||
Also here is a picture of my cat
|
||||
<img src="kitty.jpg" alt="tiger" width="400" />
|
||||
|
|
Binary file not shown.
After Width: | Height: | Size: 126 KiB |
Binary file not shown.
After Width: | Height: | Size: 4 MiB |
BIN
content/post/guides/nix/monitoring-via-prometheus/graph1.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 80 KiB |
394
content/post/guides/nix/monitoring-via-prometheus/index.md
Normal file
|
@ -0,0 +1,394 @@
|
|||
---
|
||||
title: Monitor instances
|
||||
description: Export metrics, collect them, visualize them.
|
||||
date: 2024-05-17 07:00:00+0000
|
||||
image: chris-yang-1tnS_BVy9Jk-unsplash.jpg
|
||||
categories:
|
||||
- Nix
|
||||
- Guide
|
||||
- Sysadmin
|
||||
- Monitoring
|
||||
|
||||
tags:
|
||||
- Nix
|
||||
- Nginx
|
||||
- Prometheus
|
||||
- Exporters
|
||||
- Monitoring
|
||||
- Docker compose
|
||||
draft: false
|
||||
---
|
||||
|
||||
# Monitoring
|
||||
|
||||
Monitoring your instances allows you to keep track of your servers' load and health over time. Even looking at the stats once a day can make a huge difference, as it lets you catch problems before they turn into disasters.
|
||||
I have been monitoring my servers with this method for years, and there were many occasions when I was grateful I had set it all up.
|
||||
In this small article I have included two guides to set these services up. The first uses [NixOS](#nixos); the second uses [docker-compose](#docker-compose), but it is very sparse, as the main focus of this article is NixOS.
|
||||
|
||||
![Made with Excalidraw](graph1.png)
|
||||
|
||||
**Prometheus**
|
||||
Prometheus is an open-source monitoring system. It helps to track, collect, and analyze
|
||||
metrics from various applications and infrastructure components. It collects metrics from other software called _exporters_, which serve an HTTP endpoint that returns data in the Prometheus data format.
|
||||
Here is an example from `node-exporter`:
|
||||
|
||||
```nix
|
||||
# curl http://localhost:9100/metrics
|
||||
|
||||
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
|
||||
# TYPE node_cpu_seconds_total counter
|
||||
node_cpu_seconds_total{cpu="0",mode="idle"} 2.54196405e+06
|
||||
node_cpu_seconds_total{cpu="0",mode="iowait"} 4213.44
|
||||
node_cpu_seconds_total{cpu="0",mode="irq"} 0
|
||||
node_cpu_seconds_total{cpu="0",mode="nice"} 0.06
|
||||
node_cpu_seconds_total{cpu="0",mode="softirq"} 743.4
|
||||
...
|
||||
```
|
||||
|
||||
**Grafana**
|
||||
Grafana is an open-source data visualization and monitoring platform. It has hundreds of built-in features that help you query data sources like Prometheus, InfluxDB, MySQL and so on.
|
||||
|
||||
For today's guide I will show you how to set up a few exporters (node-exporter, smartctl), collect their metrics with Prometheus, and then visualize that data in Grafana.
|
||||
|
||||
## NixOS
|
||||
|
||||
Nix makes it trivial to set up these services, as there are already predefined options for them in nixpkgs. I will give you example configuration files below that you can just copy and paste.
|
||||
|
||||
I have a guide on [remote deployment](/p/remote-deployments-on-nixos/) for NixOS. Below you can see an example of a folder structure you can use to deploy the services.
|
||||
{{< filetree/container >}}
|
||||
|
||||
{{< filetree/folder name="server1" state="closed" >}}
|
||||
|
||||
{{< filetree/folder name="services" state="closed" >}}
|
||||
{{< filetree/file name="some-service.nix" >}}
|
||||
{{< filetree/folder name="monitoring" state="closed" >}}
|
||||
{{< filetree/file name="prometheus.nix" >}}
|
||||
{{< filetree/file name="grafana.nix" >}}
|
||||
{{< filetree/folder name="exporters" state="closed" >}}
|
||||
{{< filetree/file name="node.nix" >}}
|
||||
{{< filetree/file name="smartctl.nix" >}}
|
||||
{{< /filetree/folder >}}
|
||||
{{< /filetree/folder >}}
|
||||
{{< /filetree/folder >}}
|
||||
|
||||
{{< filetree/file name="configuration.nix" >}}
|
||||
{{< filetree/file name="flake.nix" >}}
|
||||
{{< filetree/file name="flake.lock" >}}
|
||||
|
||||
{{< /filetree/folder >}}
|
||||
|
||||
{{< /filetree/container >}}
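
For these files to take effect, the host's configuration has to import them. A minimal sketch, assuming the folder layout above with `services/` sitting next to `configuration.nix` (the relative paths are an assumption based on that tree):

```nix
# configuration.nix
{ ... }: {
  imports = [
    # paths are relative to this file; adjust them to your own layout
    ./services/monitoring/prometheus.nix
    ./services/monitoring/grafana.nix
    ./services/monitoring/exporters/node.nix
    ./services/monitoring/exporters/smartctl.nix
  ];
}
```

Keeping one file per service means a service can be dropped from a host by removing a single import line.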
|
||||
|
||||
### Exporters
|
||||
|
||||
First is node-exporter. It exports all kinds of system metrics, from CPU usage and load average to the systemd service count.
|
||||
|
||||
#### Node-exporter
|
||||
|
||||
```nix
|
||||
# /services/monitoring/exporters/node.nix
|
||||
{ pkgs, ... }: {
|
||||
services.prometheus.exporters.node = {
|
||||
enable = true;
|
||||
#port = 9001; #default is 9100
|
||||
enabledCollectors = [ "systemd" ];
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
#### Smartctl
|
||||
|
||||
Smartctl is a tool included in the smartmontools package, a collection of monitoring tools for hard drives and SSDs.
|
||||
This exporter lets you check up on the health of your drive(s). The `smartd` service configured alongside it also sends a wall notification if one of your drives develops bad sectors, which usually suggests it is dying.
|
||||
|
||||
```nix
|
||||
# /services/monitoring/exporters/smartctl.nix
|
||||
{ pkgs, ... }: {
|
||||
# exporter
|
||||
services.prometheus.exporters.smartctl = {
|
||||
enable = true;
|
||||
devices = [ "/dev/sda" ];
|
||||
};
|
||||
# for wall notifications
|
||||
services.smartd = {
|
||||
enable = true;
|
||||
notifications.wall.enable = true;
|
||||
devices = [
|
||||
{
|
||||
device = "/dev/sda";
|
||||
}
|
||||
];
|
||||
};
|
||||
|
||||
}
|
||||
```
|
||||
|
||||
If you happen to have other drives, you can use `lsblk` to check their paths.
|
||||
|
||||
```bash
|
||||
nix-shell -p util-linux --command lsblk
|
||||
```
|
||||
|
||||
For example, here are my PC's drives:
|
||||
|
||||
```
|
||||
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
|
||||
sda 8:0 1 0B 0 disk
|
||||
nvme1n1 259:0 0 476,9G 0 disk
|
||||
├─nvme1n1p1 259:1 0 512M 0 part /boot
|
||||
├─nvme1n1p2 259:2 0 467,6G 0 part
|
||||
│ └─luks-bbb8e429-bee1-4b5e-8ce8-c54f5f4f29a2
|
||||
│ 254:0 0 467,6G 0 crypt /nix/store
|
||||
│ /
|
||||
└─nvme1n1p3 259:3 0 8,8G 0 part
|
||||
└─luks-f7e86dde-55a5-4306-a7c2-cf2d93c9ee0b
|
||||
254:1 0 8,8G 0 crypt [SWAP]
|
||||
nvme0n1 259:4 0 931,5G 0 disk /mnt/data
|
||||
```
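
With more than one disk, each path can be listed in both the exporter and `smartd`. A sketch using the device names from the output above; they are specific to my machine, so treat them as placeholders:

```nix
# /services/monitoring/exporters/smartctl.nix
{ ... }: {
  services.prometheus.exporters.smartctl = {
    enable = true;
    # placeholder paths taken from the lsblk output above
    devices = [ "/dev/nvme0n1" "/dev/nvme1n1" ];
  };

  services.smartd = {
    enable = true;
    notifications.wall.enable = true;
    devices = [
      { device = "/dev/nvme0n1"; }
      { device = "/dev/nvme1n1"; }
    ];
  };
}
```

This uses the same options as the single-drive example, only with longer lists.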
|
||||
|
||||
### Prometheus
|
||||
|
||||
Now that we have set up these two exporters, we need to somehow collect their metrics.
|
||||
Here is a config file for Prometheus, with the scrape configs already written down.
|
||||
|
||||
```nix
|
||||
# /services/monitoring/prometheus.nix
|
||||
{pkgs, config, ... }:{
|
||||
|
||||
services.prometheus = {
|
||||
enable = true;
|
||||
|
||||
scrapeConfigs = [
|
||||
{
|
||||
job_name = "node";
|
||||
scrape_interval = "5s";
|
||||
static_configs = [
|
||||
{
|
||||
targets = [ "localhost:${toString config.services.prometheus.exporters.node.port}" ];
|
||||
labels = { alias = "node.server1.local"; };
|
||||
}
|
||||
];
|
||||
}
|
||||
{
|
||||
job_name = "smartctl";
|
||||
scrape_interval = "5s";
|
||||
static_configs = [
|
||||
{
|
||||
targets = [ "localhost:${toString config.services.prometheus.exporters.smartctl.port}" ];
|
||||
labels = { alias = "smartctl.server1.local"; };
|
||||
}
|
||||
];
|
||||
}
|
||||
];
|
||||
};
|
||||
}
|
||||
```
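
To scrape more than one machine from a single Prometheus, a target list can hold several hosts. A sketch, assuming a second host reachable as `server2.local` (a made-up name) that runs node-exporter on its default port 9100. It is meant to replace the `node` job above rather than sit next to it, since two jobs with the same name will not pass the config check:

```nix
# /services/monitoring/prometheus.nix (excerpt)
{ config, ... }: {
  services.prometheus.scrapeConfigs = [
    {
      job_name = "node";
      static_configs = [
        {
          targets = [
            "localhost:${toString config.services.prometheus.exporters.node.port}"
            # hypothetical second host; 9100 is node-exporter's default port
            "server2.local:9100"
          ];
        }
      ];
    }
  ];
}
```

The remote host also has to let the scrape through its firewall, for example by setting `services.prometheus.exporters.node.openFirewall = true;` there.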
|
||||
|
||||
I recommend raising the 5s interval, as scraping this often generates a lot of data.
|
||||
A node-exporter scrape averages ~16 kB. One day has 86400 seconds; divided by 5, that's 17280 scrapes a day.
|
||||
17280 \* 16 = 276480 kB, which is about 270 megabytes a day. If you have multiple servers, multiply that by the number of servers.
|
||||
30 days of scraping is about 8 gigabytes for a single server. **But remember, by default Prometheus only keeps data for 15 days!**
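
One way to act on this is to scrape less often and set the retention explicitly. A sketch, assuming the `services.prometheus.globalConfig` and `services.prometheus.retentionTime` options of the NixOS module; the values are only examples:

```nix
# /services/monitoring/prometheus.nix (excerpt)
{ ... }: {
  services.prometheus = {
    # applies to every job that does not set its own scrape_interval
    globalConfig.scrape_interval = "30s";
    # how long samples are kept on disk
    retentionTime = "15d";
  };
}
```

The per-job `scrape_interval = "5s";` lines take precedence over the global value, so they would have to be removed or changed as well. At 30 seconds the same arithmetic gives 2880 scrapes a day, or roughly 45 megabytes per server.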
|
||||
|
||||
### Grafana
|
||||
|
||||
Now let's get onto gettin' a sexy dashboard like this. First we gotta set up Grafana.
|
||||
|
||||
![Node exporter full (id 1860)](20240518_1958.png)
|
||||
|
||||
```nix
|
||||
# /services/monitoring/grafana.nix
|
||||
{ pkgs, config, ... }:
|
||||
let
|
||||
grafanaPort = 3000;
|
||||
in
|
||||
{
|
||||
services.grafana = {
|
||||
enable = true;
|
||||
settings.server = {
|
||||
http_port = grafanaPort;
|
||||
http_addr = "0.0.0.0";
|
||||
};
|
||||
provision = {
|
||||
enable = true;
|
||||
datasources.settings.datasources = [
|
||||
{
|
||||
name = "prometheus";
|
||||
type = "prometheus";
|
||||
url = "http://127.0.0.1:${toString config.services.prometheus.port}";
|
||||
isDefault = true;
|
||||
}
|
||||
];
|
||||
};
|
||||
};
|
||||
|
||||
networking.firewall = {
|
||||
allowedTCPPorts = [ grafanaPort ];
|
||||
allowedUDPPorts = [ grafanaPort ];
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
If you want to access it via the internet, change the following:
|
||||
|
||||
- `http_addr = "127.0.0.1"`
|
||||
- remove the firewall allowed ports
|
||||
|
||||
This ensures data only flows through the nginx reverse proxy.
|
||||
|
||||
Remember to set `networking.domain = "example.com"` to your domain.
|
||||
|
||||
```nix
|
||||
# /services/nginx.nix
|
||||
{ pkgs, config, ... }:
|
||||
let
|
||||
url = "http://127.0.0.1:${toString config.services.grafana.settings.server.http_port}";
|
||||
in {
|
||||
services.nginx = {
|
||||
enable = true;
|
||||
|
||||
virtualHosts = {
|
||||
"grafana.${config.networking.domain}" = {
|
||||
# Auto cert by let's encrypt
|
||||
forceSSL = true;
|
||||
enableACME = true;
|
||||
|
||||
locations."/" = {
|
||||
proxyPass = url;
|
||||
extraConfig = "proxy_set_header Host $host;";
|
||||
};
|
||||
|
||||
locations."/api" = {
|
||||
extraConfig = ''
|
||||
proxy_http_version 1.1;
|
||||
proxy_set_header Upgrade $http_upgrade;
|
||||
proxy_set_header Connection $connection_upgrade;
|
||||
proxy_set_header Host $host;
|
||||
'';
|
||||
proxyPass = url;
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
|
||||
# enable 80 and 443 ports for nginx
|
||||
networking.firewall = {
|
||||
enable = true;
|
||||
allowedTCPPorts = [
|
||||
443
|
||||
80
|
||||
];
|
||||
allowedUDPPorts = [
|
||||
443
|
||||
80
|
||||
];
|
||||
};
|
||||
}
|
||||
```
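
`enableACME` only works once the ACME terms are accepted and a contact address is set; otherwise the configuration fails to build. A sketch of the extra options, with a placeholder e-mail address:

```nix
# /services/nginx.nix (excerpt)
{ ... }: {
  security.acme = {
    # required by the NixOS ACME module before any certificate is requested
    acceptTerms = true;
    # placeholder contact address registered with the certificate authority
    defaults.email = "admin@example.com";
  };
}
```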
|
||||
|
||||
### Log in
|
||||
|
||||
The default user is `admin` and the password is `admin`. Grafana will ask you to change it upon logging in!
|
||||
|
||||
### Add the dashboards
|
||||
|
||||
For node-exporter, go to dashboards --> new --> import and paste in `1860`.
|
||||
Now you can see the metrics of all your server(s).
|
||||
|
||||
Smartctl
|
||||
|
||||
## Docker-compose
|
||||
|
||||
{{< filetree/container >}}
|
||||
|
||||
{{< filetree/folder name="monitoring-project" state="closed" >}}
|
||||
|
||||
{{< filetree/file name="docker-compose.yml" >}}
|
||||
{{< filetree/file name="prometheus.yml" >}}
|
||||
|
||||
{{< /filetree/folder >}}
|
||||
|
||||
{{< /filetree/container >}}
|
||||
|
||||
### Compose project
|
||||
|
||||
I did not include a reverse proxy or smartctl, as I forgot how to actually do it; that's how long I've been using Nix :/
|
||||
|
||||
```yaml
|
||||
# docker-compose.yml
|
||||
version: "3.8"
|
||||
|
||||
networks:
|
||||
monitoring:
|
||||
driver: bridge
|
||||
|
||||
volumes:
|
||||
prometheus_data: {}
|
||||
|
||||
services:
|
||||
node-exporter:
|
||||
image: prom/node-exporter:latest
|
||||
container_name: node-exporter
|
||||
restart: unless-stopped
|
||||
hostname: node-exporter
|
||||
volumes:
|
||||
- /proc:/host/proc:ro
|
||||
- /sys:/host/sys:ro
|
||||
- /:/rootfs:ro
|
||||
command:
|
||||
- "--path.procfs=/host/proc"
|
||||
- "--path.rootfs=/rootfs"
|
||||
- "--path.sysfs=/host/sys"
|
||||
- "--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)"
|
||||
networks:
|
||||
- monitoring
|
||||
|
||||
prometheus:
|
||||
image: prom/prometheus:latest
|
||||
container_name: prometheus
|
||||
restart: unless-stopped
|
||||
hostname: prometheus
|
||||
volumes:
|
||||
- ./prometheus.yml:/etc/prometheus/prometheus.yml
|
||||
- prometheus_data:/prometheus
|
||||
command:
|
||||
- "--config.file=/etc/prometheus/prometheus.yml"
|
||||
- "--storage.tsdb.path=/prometheus"
|
||||
- "--web.console.libraries=/etc/prometheus/console_libraries"
|
||||
- "--web.console.templates=/etc/prometheus/consoles"
|
||||
- "--web.enable-lifecycle"
|
||||
networks:
|
||||
- monitoring
|
||||
|
||||
grafana:
|
||||
image: grafana/grafana:latest
|
||||
container_name: grafana
|
||||
networks:
|
||||
- monitoring
|
||||
restart: unless-stopped
|
||||
ports:
|
||||
- '3000:3000'
|
||||
```
|
||||
|
||||
```yaml
|
||||
# ./prometheus.yml
|
||||
global:
|
||||
scrape_interval: 5s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: "node"
|
||||
static_configs:
|
||||
- targets: ["node-exporter:9100"]
|
||||
```
|
||||
|
||||
```bash
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
### Set up Prometheus as a data source inside Grafana
|
||||
|
||||
Head to Connections --> Data sources --> Add new data source --> Prometheus
|
||||
Type in `http://prometheus:9090` as the URL, then click `Save & test` at the bottom.
|
||||
|
||||
Now you can add the dashboards, as [explained in this section](#add-the-dashboards).
|
||||
|
||||
Photo by <a href="https://unsplash.com/@chrisyangchrisfilm?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Chris Yang</a> on <a href="https://unsplash.com/photos/silhouette-photography-of-man-1tnS_BVy9Jk?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash">Unsplash</a>
|
69
layouts/partials/article/components/details.html
Normal file
|
@ -0,0 +1,69 @@
|
|||
<div class="article-details">
|
||||
{{ if .Params.categories }}
|
||||
<header class="article-category">
|
||||
{{ range (.GetTerms "categories") }}
|
||||
<a href="{{ .RelPermalink }}" {{ with .Params.style }}style="background-color: {{ .background }}; color: {{ .color }};"{{ end }}>
|
||||
{{ .LinkTitle }}
|
||||
</a>
|
||||
{{ end }}
|
||||
</header>
|
||||
{{ end }}
|
||||
|
||||
<div class="article-title-wrapper">
|
||||
<h2 class="article-title">
|
||||
<a href="{{ .RelPermalink }}">
|
||||
{{- .Title -}}
|
||||
</a>
|
||||
</h2>
|
||||
|
||||
{{ with .Params.description }}
|
||||
<h3 class="article-subtitle">
|
||||
{{ . }}
|
||||
</h3>
|
||||
{{ end }}
|
||||
</div>
|
||||
|
||||
{{ $showReadingTime := .Params.readingTime | default (.Site.Params.article.readingTime) }}
|
||||
{{ $showDate := not .Date.IsZero }}
|
||||
{{ $showFooter := or $showDate $showReadingTime }}
|
||||
{{ if $showFooter }}
|
||||
|
||||
<footer class="article-time">
|
||||
{{ if $showDate }}
|
||||
<div>
|
||||
{{ partial "helper/icon" "date" }}
|
||||
<time class="article-time--published">
|
||||
{{- .Date.Format (or .Site.Params.dateFormat.published "Jan 02, 2006") -}}
|
||||
</time>
|
||||
</div>
|
||||
{{ end }}
|
||||
|
||||
{{ if $showReadingTime }}
|
||||
<div>
|
||||
{{ partial "helper/icon" "clock" }}
|
||||
<time class="article-time--reading">
|
||||
{{ T "article.readingTime" .ReadingTime }}
|
||||
</time>
|
||||
</div>
|
||||
{{ end }}
|
||||
|
||||
<div>
|
||||
{{ partial "helper/icon" "letter-case-lower" }}
|
||||
<time class="article-time--reading">
|
||||
{{ .WordCount }} words
|
||||
</time>
|
||||
</div>
|
||||
</footer>
|
||||
{{ end }}
|
||||
|
||||
{{ if .IsTranslated }}
|
||||
<footer class="article-translations">
|
||||
{{ partial "helper/icon" "language" }}
|
||||
<div>
|
||||
{{ range .Translations }}
|
||||
<a href="{{ .Permalink }}" class="link">{{ .Language.LanguageName }}</a>
|
||||
{{ end }}
|
||||
</div>
|
||||
</footer>
|
||||
{{ end }}
|
||||
</div>
|