Telegraf Role
Overview
Telegraf is a plugin-driven server agent for collecting and sending metrics. Its input plugins cover system metrics, application metrics, and custom data sources, which makes it a versatile monitoring tool. In the solti-monitoring collection it gathers metrics and ships them to InfluxDB for storage and analysis, running as a lightweight agent with minimal resource overhead.
Requirements
- Ansible 2.9+.
- Target system with internet access (for package installation).
- A running InfluxDB instance (local or remote) to receive metrics.
Installation / Quick Start
The telegraf role installs and configures Telegraf on target hosts.
Basic Installation
- hosts: monitoring_clients
  become: true
  roles:
    - role: jackaltx.solti_monitoring.telegraf
      vars:
        telegraf_output_influxdb: true
        telegraf_output_url: "http://monitor.example.com:8086"
        telegraf_output_token: "{{ vault_telegraf_token }}"  # Store in Ansible Vault
        telegraf_output_org: "myorg"
        telegraf_output_bucket: "telegraf"
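Applying the play can be sketched as a small wrapper around `ansible-playbook`. The function name `run_telegraf_playbook` and the file names `inventory.yml` and `telegraf.yml` are hypothetical placeholders, not names defined by the role. `--ask-vault-pass` prompts for the vault password so that `vault_telegraf_token` can be decrypted.

```shell
# Apply a playbook that includes the telegraf role.
# $1 = inventory file, $2 = playbook file (both defaults are hypothetical names)
run_telegraf_playbook() {
    ansible-playbook -i "${1:-inventory.yml}" "${2:-telegraf.yml}" --ask-vault-pass
}
```

Run `run_telegraf_playbook` from the directory that holds the inventory and playbook, or pass both paths explicitly.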
Role Variables / Configuration
Output Configuration
| Name | Description | Default |
|---|---|---|
| `telegraf_output_influxdb` | Enable InfluxDB v2 output plugin. | `true` |
| `telegraf_output_url` | URL of the InfluxDB instance. | `""` |
| `telegraf_output_token` | Authentication token for InfluxDB. | `""` |
| `telegraf_output_org` | InfluxDB organization name. | `""` |
| `telegraf_output_bucket` | InfluxDB bucket name. | `telegraf` |
Input Plugins
System metrics (cpu, disk, diskio, mem, net, system, processes) are enabled by default.
| Name | Description | Default |
|---|---|---|
| `telegraf_enable_docker` | Enable Docker container metrics. | `false` |
| `telegraf_enable_nginx` | Enable Nginx web server metrics. | `false` |
| `telegraf_enable_postgresql` | Enable PostgreSQL database metrics. | `false` |
| `telegraf_enable_redis` | Enable Redis metrics. | `false` |
Global Tags
| Name | Description | Default |
|---|---|---|
| `telegraf_global_tags` | Dictionary of custom tags to add to all collected metrics. | `{}` |
Usage
Testing Configuration
Validate Telegraf configuration without starting the service:
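A minimal sketch of such a check, assuming the package-default config path `/etc/telegraf/telegraf.conf` (this role may place the file elsewhere). The wrapper name `telegraf_dry_run` is hypothetical; `--config` and `--test` are standard Telegraf flags. With `--test`, Telegraf gathers each input once, prints the metrics to stdout, and exits without writing to any output.

```shell
# Dry-run a Telegraf config: gather every input once and print the metrics.
# Nothing is sent to the configured outputs.
# $1 = config file (default is the usual package location, an assumption here)
telegraf_dry_run() {
    telegraf --config "${1:-/etc/telegraf/telegraf.conf}" --test
}
```

Call `telegraf_dry_run` on the target host, or pass another path such as `telegraf_dry_run /tmp/candidate.conf`. A parse error in the config makes the command exit non-zero. Reading the config may require root or the `telegraf` user, depending on file permissions.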
Example Configurations
The role supports various configurations, such as basic system monitoring, remote collection via WireGuard, and multi-output setups.
Service Management
Telegraf runs as a systemd service.
# Check status
systemctl status telegraf
# Start/stop/restart
systemctl start telegraf
systemctl stop telegraf
systemctl restart telegraf
# Enable at boot
systemctl enable telegraf
Troubleshooting
Check Logs
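Because Telegraf runs as a systemd service, its logs can usually be read from the journal. The sketch below assumes the unit is named `telegraf` (as in the Service Management commands) and that logging goes to the journal rather than a separate log file. The function names are hypothetical.

```shell
# Show the most recent Telegraf log lines from the systemd journal.
# $1 = number of lines to show (default: 50)
telegraf_logs() {
    journalctl -u telegraf -n "${1:-50}" --no-pager
}

# Follow the log live; stop with Ctrl-C.
telegraf_logs_follow() {
    journalctl -u telegraf -f
}
```

For example, `telegraf_logs 200` prints the last 200 lines, which is often enough to spot output errors after a restart.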
Test Connection to InfluxDB
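One way to check reachability from a client host is to query the InfluxDB v2 `/health` endpoint with `curl`. This is a sketch: the function name `influxdb_health` is hypothetical, and the default URL is the example hostname from the Quick Start, so substitute the real value of `telegraf_output_url`.

```shell
# Probe the InfluxDB v2 health endpoint.
# curl exits non-zero if the host is unreachable, the request times out,
# or the server answers with an HTTP error status (--fail).
# $1 = base URL of the InfluxDB instance (default is the example host)
influxdb_health() {
    curl --silent --show-error --fail --max-time 5 \
        "${1:-http://monitor.example.com:8086}/health"
}
```

A reachable instance answers with a short JSON document describing its status. The health endpoint does not exercise the token, so a success here rules out network and firewall problems but not authentication failures.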
Verify Metrics Collection
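To confirm that inputs actually produce data, Telegraf can gather a subset of plugins once and print the result. The sketch below combines the standard `--input-filter` and `--test` flags. The function name `telegraf_sample` is hypothetical and the config path is the package default, which is an assumption about this role.

```shell
# Gather only the named inputs once and print the resulting metrics.
# $1 = colon-separated input plugin names (default: cpu:mem)
# $2 = config file (default is the usual package location, an assumption)
telegraf_sample() {
    telegraf --config "${2:-/etc/telegraf/telegraf.conf}" \
        --input-filter "${1:-cpu:mem}" --test
}
```

For example, `telegraf_sample disk:net` prints one round of disk and network metrics. This verifies collection only; delivery to InfluxDB still has to be confirmed in the logs or in the bucket itself.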
Common Issues
- Connection refused: InfluxDB is not reachable (check the network, InfluxDB status, and firewall rules).
- Authentication failed: the token is invalid (verify the token and its permissions).
- No data in InfluxDB: metrics are not being sent (check the logs and output configuration, then test with `--test`).
Role-Specific Sections
Performance Tuning
- Collection Interval: Adjust `telegraf_interval` and `telegraf_flush_interval` for collection frequency.
- Metric Filtering: Use `telegraf_metric_filters` to reduce metric volume.
Security Considerations
- Store tokens in Ansible Vault.
- Use HTTPS endpoints when possible.
- Employ WireGuard for remote collectors.
- Grant minimum required permissions (least privilege).
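Storing the token in Ansible Vault can be sketched with `ansible-vault encrypt_string`. The function name `encrypt_telegraf_token` is hypothetical. The default variable name `vault_telegraf_token` matches the one referenced in the Quick Start playbook. Reading the secret from stdin keeps it out of the shell history.

```shell
# Encrypt a token as an inline vault value suitable for a vars file.
# The token is read from stdin instead of being passed as an argument.
# $1 = variable name to emit (default: vault_telegraf_token)
encrypt_telegraf_token() {
    ansible-vault encrypt_string --stdin-name "${1:-vault_telegraf_token}"
}
```

Run `encrypt_telegraf_token`, paste the token, and end input with Ctrl-D. The command prints a YAML snippet with an encrypted value that can be pasted into a vars file.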
Reference Deployment
Refer to the Reference Deployments chapter for real-world examples, such as monitor11.example.com (server with local Telegraf) and ispconfig3.example.com (client shipping metrics via WireGuard).
Reference
License
MIT
Author
Created by jackaltx and Claude.