This project uses Vagrant to deploy servers running Prometheus, Telegraf, M3DB, Grafana, and a few other components.
This can be used to test configurations for both Prometheus and Grafana.
Prometheus is configured to scrape metrics from Telegraf, node_exporter, Grafana and M3DB. The configuration also enables remote read/write against M3DB.
You can run stress-ng to test the Prometheus alerts.
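As a rough sketch of such a test, the helper below wraps a CPU-only stress-ng run. The function name `run_cpu_stress` and its defaults are hypothetical. Which alert rules fire, and after how long, depends on the alert configuration in the repository, so the worker count and duration are only starting points.

```shell
#!/usr/bin/env bash
# Hypothetical helper: generate CPU load so that CPU-related alerts can fire.
run_cpu_stress() {
  # usage: run_cpu_stress WORKERS DURATION   (e.g. run_cpu_stress 2 5m)
  local workers="${1:-2}" duration="${2:-5m}"
  if ! command -v stress-ng >/dev/null 2>&1; then
    echo "stress-ng is not installed on this machine" >&2
    return 127
  fi
  # --cpu N starts N CPU workers; --timeout stops them after the given duration.
  stress-ng --cpu "$workers" --timeout "$duration" --metrics-brief
}
```

Inside one of the servers, `run_cpu_stress 2 5m` would keep two CPU workers busy for five minutes. Pick a duration longer than the `for:` interval of the alert you want to trigger.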
Set the environment variable SLACK_API_URL to your Slack token URL to enable Slack notifications.
- Linux, Windows or macOS host with at least 16GB RAM (depends on the project)
- VirtualBox - https://www.virtualbox.org/
- Vagrant - https://www.vagrantup.com/
To create the servers execute the following commands:
vagrant plugin install vagrant-hostmanager
vagrant up
On Windows take a look at:
The provisioning clones the Prometheus configuration repository, which contains configuration you can use, into /home/vagrant/projects/prometheus-configuration.
Change to this directory and, in the file grafana/config.grafana, set the environment variables PROM_SERVER_ADDR and PROM_SERVER_PORT to:
- PROM_SERVER_ADDR: mon-1
- PROM_SERVER_PORT: 9090
vagrant ssh mon-1
cd projects/prometheus-configuration
# if you want to send notifications to a slack channel change the file
# alertmanager/alertmanager.yml
docker-compose up -d
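The PROM_SERVER_ADDR and PROM_SERVER_PORT change described above could be scripted roughly as follows. This is a minimal sketch that assumes grafana/config.grafana holds plain `KEY=value` lines, which the text above does not confirm. The helper name `set_env_var` is hypothetical, and the demo runs on a scratch file rather than the real one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set KEY=value in an env-style file.
# Assumption: the file uses plain KEY=value lines.
set_env_var() {
  # usage: set_env_var FILE KEY VALUE
  local file="$1" key="$2" value="$3"
  if grep -q "^${key}=" "$file"; then
    # Replace the existing assignment in place (values must not contain '|' or '&').
    sed -i.bak "s|^${key}=.*|${key}=${value}|" "$file"
    rm -f "$file.bak"
  else
    # Append the assignment if the key is missing.
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Demo on a scratch file; on mon-1 you would pass grafana/config.grafana instead.
demo_file="$(mktemp)"
printf 'PROM_SERVER_ADDR=prometheus\nPROM_SERVER_PORT=9999\n' > "$demo_file"
set_env_var "$demo_file" PROM_SERVER_ADDR mon-1
set_env_var "$demo_file" PROM_SERVER_PORT 9090
cat "$demo_file"
```

Run the helper before `docker-compose up -d` so that Grafana starts with the new values.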
To deploy everything with High Availability, you only need to change a few files.
When mon-1 and mon-2 are created, the provisioning clones the Prometheus configuration repository into /home/vagrant/projects/prometheus-configuration on each server.
To enable HA, change the following files (paths relative to the repository directory):
- docker-compose.yml
- prometheus/prometheus.yml
- karma/karma.yml
- grafana/config.grafana
- alertmanager/alertmanager.yml
You need to change these files on both mon-1 and mon-2.
Under the alertmanager section of docker-compose.yml, change the line:
# - '--cluster.peer=alertmanager2:9094'
to (for mon-1)
- '--cluster.peer=mon-2:9094'
and (for mon-2)
- '--cluster.peer=mon-1:9094'
Uncomment the line (on both mon-1 and mon-2):
# - 9094:9094
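The two docker-compose.yml edits above can be sketched as a pair of sed substitutions. The helper name `enable_alertmanager_ha` is hypothetical, and the demo input is an invented excerpt, not the repository's actual docker-compose.yml. Only the two commented lines are taken from the text above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: uncomment the cluster peer flag and the 9094 port mapping.
enable_alertmanager_ha() {
  # usage: enable_alertmanager_ha COMPOSE_FILE PEER_HOST
  local file="$1" peer="$2"
  # Each substitution starts at the '#', so the YAML indentation before it is preserved.
  sed -i.bak \
    -e "s|#[[:space:]]*- '--cluster.peer=alertmanager2:9094'|- '--cluster.peer=${peer}:9094'|" \
    -e "s|#[[:space:]]*- 9094:9094|- 9094:9094|" \
    "$file"
  rm -f "$file.bak"
}

# Demo on an invented excerpt; on mon-1 you would pass docker-compose.yml and mon-2.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
      # - '--cluster.peer=alertmanager2:9094'
    ports:
      - 9093:9093
      # - 9094:9094
EOF
enable_alertmanager_ha "$demo" mon-2
cat "$demo"
```

On mon-2 the peer argument would be `mon-1` instead, so that each Alertmanager points at the other one.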
Change the alerting section of prometheus/prometheus.yml from:
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093
            # - alertmanager2:9093
to (on both mon-1 and mon-2):
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - mon-1:9093
            - mon-2:9093
Uncomment the remote read/write block and change the server to mon-lts. At the end, the block should look like this:
# remote read/write to/from M3DB
remote_read:
  - url: 'http://mon-lts:7201/api/v1/prom/remote/read'
    # to test reading even when local Prometheus has the data
    read_recent: true
remote_write:
  - url: 'http://mon-lts:7201/api/v1/prom/remote/write'
Change the targets of every job to scrape from both mon-1 and mon-2, except for the m3 job. Uncomment the m3 job and change its target from m3db-server to mon-lts.
Change karma/karma.yml from (on both mon-1 and mon-2):
alertmanager:
  interval: 30s
  servers:
    - name: alertmanager1
      uri: http://alertmanager:9093
      timeout: 20s
      proxy: true
    # - name: alertmanager2
    #   uri: http://alertmanager2:9093
    #   timeout: 20s
    #   proxy: true
to:
alertmanager:
  interval: 30s
  servers:
    - name: alertmanager1
      uri: http://mon-1:9093
      timeout: 20s
      proxy: true
    - name: alertmanager2
      uri: http://mon-2:9093
      timeout: 20s
      proxy: true
Karma can be accessed on both mon-1 and mon-2, so you can put a load balancer in front of them to manage HA.
You can configure Grafana (grafana/config.grafana) in two ways:
- each Grafana uses the Prometheus running on the same server as its datasource
- each Grafana uses the M3DB LTS as its datasource
For the first option, change PROM_SERVER_ADDR to mon-1 or mon-2 respectively. For the second, change PROM_SERVER_ADDR and PROM_SERVER_PORT to mon-lts and 7201 respectively.
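To make the two options concrete, the sketch below prints the values each one implies. The function name `grafana_datasource_env` and the mode names `local` and `lts` are hypothetical, and the `KEY=value` output format is an assumption about grafana/config.grafana. Port 9090 for the first option is the Prometheus port used in the single-server setup earlier in this document.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the datasource variables for each Grafana option.
grafana_datasource_env() {
  # usage: grafana_datasource_env local HOST   (HOST is mon-1 or mon-2)
  #        grafana_datasource_env lts
  case "$1" in
    local)
      [ -n "${2:-}" ] || { echo "local mode needs mon-1 or mon-2" >&2; return 1; }
      # Option 1: the Prometheus running on the same server.
      printf 'PROM_SERVER_ADDR=%s\nPROM_SERVER_PORT=9090\n' "$2"
      ;;
    lts)
      # Option 2: the M3DB LTS endpoint.
      printf 'PROM_SERVER_ADDR=mon-lts\nPROM_SERVER_PORT=7201\n'
      ;;
    *)
      echo "usage: grafana_datasource_env local HOST | lts" >&2
      return 1
      ;;
  esac
}

grafana_datasource_env local mon-2
grafana_datasource_env lts
```

With the first option each Grafana depends only on its own server. With the second, both Grafana instances read the same long-term data from mon-lts.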
In alertmanager/alertmanager.yml, change the configuration to send notifications to Slack (or another receiver).
- Grafana Provisioning
- Karma authentication
- Alerta Metrics and authentication
- M3DB retention period