Loki Role¶
Overview¶
Grafana Loki is a log aggregation system inspired by Prometheus. It indexes only the labels attached to each log stream rather than the full log text, which keeps storage compact, and it is queried with LogQL. This role uses Loki to store logs from Alloy collectors, provide a query API for Grafana dashboards, aggregate logs across multiple hosts, and optionally keep data in an S3-compatible object storage backend.
Requirements¶
- Ansible 2.9+
- Target system with Podman and systemd.
- S3-compatible object storage (optional, for the `s3` backend).
Installation / Quick Start¶
The loki role deploys Loki as a Podman container using systemd quadlets.
Basic Deployment¶
- hosts: monitoring_servers
  become: true
  roles:
    - role: jackaltx.solti_monitoring.loki
      vars:
        loki_retention: "30d"
S3-Backed Deployment¶
- hosts: monitoring_servers
  become: true
  roles:
    - role: jackaltx.solti_monitoring.loki
      vars:
        loki_storage_type: "s3"
        loki_s3_endpoint: "storage.example.com:8010"
        loki_s3_bucket: "loki11"
        loki_s3_access_key: "{{ vault_s3_access }}"  # Store in Ansible Vault
        loki_s3_secret_key: "{{ vault_s3_secret }}"  # Store in Ansible Vault
Role Variables / Configuration¶
Basic Configuration¶
| Name | Description | Default |
|---|---|---|
| `loki_version` | The Loki version to deploy. | 2.9 |
| `loki_port` | The HTTP API port for Loki. | 3100 |
| `loki_retention` | The log retention period. | 30d |
| `loki_max_chunk_age` | Maximum chunk age before flush. | 2h |
Storage Backends¶
| Name | Description | Default |
|---|---|---|
| `loki_storage_type` | The storage backend type (filesystem or s3). | filesystem |
| `loki_data_path` | Path for filesystem storage. | /var/lib/loki |
| `loki_s3_endpoint` | S3 endpoint URL. | "" |
| `loki_s3_bucket` | S3 bucket name. | "" |
| `loki_s3_access_key` | S3 access key (store in Ansible Vault). | "" |
| `loki_s3_secret_key` | S3 secret key (store in Ansible Vault). | "" |
| `loki_s3_region` | S3 region (optional). | us-east-1 |
Container Configuration¶
| Name | Description | Default |
|---|---|---|
| `loki_container_name` | The name for the Loki container. | loki |
| `loki_image` | The Docker image for Loki. | docker.io/grafana/loki:2.9 |
| `loki_restart_policy` | The container restart policy. | always |
Usage¶
Health Check¶
Verify Loki is running and ready to serve requests:
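A minimal sketch of such a check, built on the same `/ready` endpoint used in the Troubleshooting section. `LOKI_URL` and `loki_ready` are names chosen here for illustration, not part of the role; port 3100 is the default `loki_port`.

```shell
# Readiness probe for Loki.
# LOKI_URL is an assumed variable name; 3100 is the role's default loki_port.
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

loki_ready() {
  # -f makes curl exit non-zero on HTTP errors (for example while Loki is
  # still starting); --max-time bounds the wait if nothing is listening.
  curl -fsS --max-time 5 "${LOKI_URL}/ready"
}

if loki_ready >/dev/null 2>&1; then
  echo "Loki is ready"
else
  echo "Loki is not ready"
fi
```

Loki answers `/ready` with a success status once it can serve requests, so the exit code of `loki_ready` is usually enough for scripts or monitoring probes.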
API Access¶
Loki provides a query API for LogQL and a push API for ingesting logs (used by Alloy).
- Query API: Supports instant and range queries for logs.
- Push API: Used by clients like Alloy to send logs to Loki.
- Label Discovery: Endpoints available to list all labels and their values.
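The three API groups above can be exercised with `curl`. The sketch below uses Loki's standard HTTP paths under `/loki/api/v1`; the function names, the `LOKI_URL` variable, and the one-hour window are assumptions made for illustration. `loki_push` is only a smoke test with a hand-built JSON body, since in normal operation Alloy does the pushing.

```shell
# Assumed variable name; 3100 is the role's default loki_port.
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

# Label discovery: list all label names.
loki_labels() {
  curl -fsS --max-time 10 "${LOKI_URL}/loki/api/v1/labels"
}

# Label discovery: list the values of one label ($1 = label name).
loki_label_values() {
  curl -fsS --max-time 10 "${LOKI_URL}/loki/api/v1/label/$1/values"
}

# Query API: range query over the last hour ($1 = LogQL, $2 = optional limit).
loki_query_range() {
  # Loki accepts start/end as Unix epoch nanoseconds.
  start="$(( $(date +%s) - 3600 ))000000000"
  curl -fsS --max-time 30 -G "${LOKI_URL}/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "start=${start}" \
    --data-urlencode "limit=${2:-100}"
}

# Push API: send one test line ($1 = service_type label value, $2 = log line).
# The line is embedded as-is, so it must not contain quotes or backslashes.
loki_push() {
  ts="$(date +%s)000000000"
  curl -fsS --max-time 10 "${LOKI_URL}/loki/api/v1/push" \
    -H "Content-Type: application/json" \
    --data "{\"streams\":[{\"stream\":{\"service_type\":\"$1\"},\"values\":[[\"${ts}\",\"$2\"]]}]}"
}
```

For example, `loki_push smoke_test hello` followed by `loki_query_range '{service_type="smoke_test"}'` should return the line that was just pushed.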
Service Management¶
Loki runs as a Podman container managed by systemd.
# Check status
systemctl status loki
# Start/stop/restart
systemctl start loki
systemctl stop loki
systemctl restart loki
# View logs
journalctl -u loki -f
# Check container status
podman ps | grep loki
Troubleshooting¶
Check Container Status¶
podman ps -a | grep loki
podman logs loki
Check Service Status¶
systemctl status loki
journalctl -u loki -n 100
Verify API Access¶
curl http://localhost:3100/ready
curl http://localhost:3100/metrics
Test Query¶
curl -G "http://localhost:3100/loki/api/v1/query" \
--data-urlencode 'query={service_type="fail2ban"}' \
--data-urlencode 'limit=5'
Common Issues¶
- Container won't start: Check `podman logs loki`.
- API not accessible: Verify port `3100` is open, check firewall.
- No logs appearing: Check Alloy collectors are configured correctly.
- Out of disk space: Reduce retention period or use S3 backend.
- High memory usage: Reduce chunk cache sizes or add more RAM.
- Query timeouts: Optimize queries, add time range filters.
Role-Specific Sections¶
LogQL Query Language¶
Loki uses LogQL, a powerful query language for filtering and aggregating logs. It supports basic queries, log parsing (regex, JSON), and various aggregations like count_over_time and rate.
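A few illustrative LogQL expressions for the features named above, wrapped in a small range-query helper. The `service_type="fail2ban"` selector comes from the test query in the Troubleshooting section; the helper name `logql`, the `Q_*` variables, the `ip` capture group, and the assumed `Ban <address>` line format are examples only and may not match your logs.

```shell
# Assumed variable name; 3100 is the role's default loki_port.
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

# Run a LogQL expression as a range query. Without start/end parameters
# Loki queries the last hour.
logql() {
  curl -fsS --max-time 30 -G "${LOKI_URL}/loki/api/v1/query_range" \
    --data-urlencode "query=$1" \
    --data-urlencode "limit=20"
}

# Basic query: stream selector plus a line filter.
Q_FILTER='{service_type="fail2ban"} |= "Ban"'

# Log parsing: extract fields from JSON-formatted lines.
Q_JSON='{service_type="fail2ban"} | json'

# Log parsing: extract a label with a named regex capture group.
Q_REGEX='{service_type="fail2ban"} | regexp `Ban (?P<ip>\S+)`'

# Aggregation: number of lines per stream in each 5-minute window.
Q_COUNT='count_over_time({service_type="fail2ban"}[5m])'

# Aggregation: lines per second, summed over all matching streams.
Q_RATE='sum(rate({service_type="fail2ban"}[5m]))'
```

Run any of them with, for example, `logql "$Q_COUNT"`. The single quotes keep the shell from interpreting the braces, double quotes, and backticks inside the expressions.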
Retention Configuration¶
Log retention can be configured via loki_retention. Loki automatically handles chunk compaction.
Resource Requirements¶
Minimum requirements are 2 CPU cores and 1GB RAM. Sizing guidance is provided for small, medium, and large deployments, considering log volume and retention.
Performance Tuning¶
Chunk configuration, cache settings, and ingestion rate limits can be tuned for better performance.
Backup and Recovery¶
- Filesystem Backend: Requires manual backup of the data directory after stopping Loki.
- S3 Backend: Data is automatically stored in object storage, simplifying disaster recovery.
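For the filesystem backend, the stop, archive, and restart sequence can be sketched as follows. This is one possible approach and not something the role provides: `backup_loki_data`, `LOKI_DATA_PATH`, and `BACKUP_DIR` are hypothetical names, `/var/backups` is an assumed destination, and `/var/lib/loki` is the default `loki_data_path`.

```shell
# Cold backup of Loki's filesystem storage. Run as root.
LOKI_DATA_PATH="${LOKI_DATA_PATH:-/var/lib/loki}"  # default loki_data_path
BACKUP_DIR="${BACKUP_DIR:-/var/backups}"           # assumed destination

backup_loki_data() {
  archive="${BACKUP_DIR}/loki-$(date +%Y%m%d-%H%M%S).tar.gz"

  # Stop Loki first so chunks and index files are not modified mid-archive.
  systemctl stop loki || return 1

  # -C stores paths relative to the data directory.
  status=0
  tar -czf "$archive" -C "$LOKI_DATA_PATH" . || status=$?

  # Bring Loki back up even if the archive step failed.
  systemctl start loki

  if [ "$status" -ne 0 ]; then
    return "$status"
  fi
  echo "$archive"
}
```

Calling `backup_loki_data` prints the path of the new archive. Restoring is the reverse: stop Loki, extract the archive into the data directory with `tar -xzf <archive> -C /var/lib/loki`, then start Loki again.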
Monitoring Loki¶
Loki exposes Prometheus metrics at /metrics, which can be scraped for monitoring its own health and performance.
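Before wiring up a Prometheus scrape job, the endpoint can be inspected by hand. The helper below is a sketch: `loki_metrics` and `LOKI_URL` are names chosen here, and the metric prefix to filter on is whatever you pass in.

```shell
# Assumed variable name; 3100 is the role's default loki_port.
LOKI_URL="${LOKI_URL:-http://localhost:3100}"

# Print the samples whose metric name starts with the given prefix ($1),
# skipping the "# HELP" and "# TYPE" comment lines.
loki_metrics() {
  curl -fsS --max-time 10 "${LOKI_URL}/metrics" | grep -v '^#' | grep "^$1"
}
```

For example, `loki_metrics loki_` lists only Loki's own metrics and leaves out the Go runtime and process metrics that share the endpoint; `loki_metrics loki_build_info` should show the running version, assuming that metric is present in your Loki build.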
Security Considerations¶
- Restrict network access to port `3100`.
- Consider authentication via a reverse proxy and using HTTPS in production.
- Store S3 credentials in Ansible Vault.
- Practice log sanitization to avoid logging sensitive data.
Reference¶
License¶
MIT
Author¶
Created by jackaltx and Claude.