Storage Configuration
Overview¶
This page was prepared while I was researching retention periods. Metrics build up fast; there are some ways to downsample that I may experiment with in the future.
Both InfluxDB and Loki support multiple storage backends for different use cases. This page covers storage configuration options and trade-offs.
Storage Backend Options¶
Local Filesystem¶
Description: Data stored on local disk attached to server
Pros:
- Simple setup
- Fast access
- No external dependencies
Cons:
- Limited scalability
- No built-in replication
- Capacity limited by local disk
Use case: Development, small deployments, low data volume
Configuration:
# InfluxDB
influxdb_storage_type: "local"
influxdb_data_path: "/var/lib/influxdb2"
# Loki
loki_storage_type: "filesystem"
loki_data_path: "/var/lib/loki"
NFS Mounts¶
Description: Data stored on network-attached storage via NFS
Pros:
- Centralized storage
- Easy to scale capacity
- Shared access across hosts
Cons:
- Network latency
- NFS overhead
- Single point of failure (without HA NFS)
Use case: Medium deployments with shared storage infrastructure
Configuration:
# Mount NFS first
- name: Mount NFS storage
  mount:
    path: /mnt/nfs/monitoring
    src: "storage.example.com:/exports/monitoring"
    fstype: nfs
    opts: "rw,sync"
    state: mounted
# Then configure applications
influxdb_data_path: "/mnt/nfs/monitoring/influxdb"
loki_data_path: "/mnt/nfs/monitoring/loki"
S3-Compatible Object Storage¶
Description: Data stored in S3-compatible object storage (AWS S3, MinIO, etc.)
Pros:
- Virtually unlimited capacity
- Cost-effective for large volumes
- Built-in replication and durability
- Separate compute from storage
Cons:
- Slightly higher latency than local
- Requires object storage infrastructure
- Additional cost for object storage
Use case: Large deployments, long retention periods, cost optimization
InfluxDB with S3:
influxdb_storage_type: "s3"
influxdb_s3_endpoint: "storage.example.com:8010"
influxdb_s3_bucket: "influx11"
influxdb_s3_access_key: "{{ vault_s3_access }}"
influxdb_s3_secret_key: "{{ vault_s3_secret }}"
influxdb_s3_retention: "30d"
Loki with S3:
loki_storage_type: "s3"
loki_s3_endpoint: "storage.example.com:8010"
loki_s3_bucket: "loki11"
loki_s3_access_key: "{{ vault_s3_access }}"
loki_s3_secret_key: "{{ vault_s3_secret }}"
Storage Sizing¶
InfluxDB Storage¶
Factors affecting size:
- Number of hosts
- Metrics per host
- Collection frequency
- Retention period
- Cardinality (unique tag combinations)
Estimation formula:
Storage = Hosts × Metrics × Frequency × Retention × BytesPerSample × Overhead
Example:
10 hosts × 100 metrics × 6 samples/min × 30 days × 10 bytes
= 10 × 100 × 259200 samples × 10 bytes   (6 samples/min × 1440 min/day × 30 days)
= 2.59 GB
Overhead factor: Add 50-100% for indexes and compaction
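The estimate can be reproduced with plain shell arithmetic. This is a minimal sketch, not part of the deployment: the function names estimate_influx_bytes and with_overhead are made up for illustration, and the 10 bytes per sample figure is the assumption used in the example, not a measured value.

```shell
# Estimate raw InfluxDB storage in bytes.
# Arguments: hosts, metrics per host, samples per minute, retention days,
#            bytes per sample (10 is the assumed figure from the example).
estimate_influx_bytes() {
    hosts=$1; metrics=$2; per_min=$3; days=$4; bytes_per_sample=$5
    # Samples stored per series over the whole retention window.
    samples_per_series=$(( per_min * 60 * 24 * days ))
    echo $(( hosts * metrics * samples_per_series * bytes_per_sample ))
}

# Add an overhead percentage (50 to 100 for indexes and compaction).
# Arguments: raw bytes, overhead percent.
with_overhead() {
    echo $(( $1 * (100 + $2) / 100 ))
}

raw=$(estimate_influx_bytes 10 100 6 30 10)
echo "raw bytes:          $raw"
echo "with 50% overhead:  $(with_overhead "$raw" 50)"
echo "with 100% overhead: $(with_overhead "$raw" 100)"
```

With these inputs the raw figure is 2592000000 bytes (about 2.59 GB), or roughly 3.9 to 5.2 GB once overhead is included.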
Loki Storage¶
Factors affecting size:
- Number of hosts
- Log volume per host (lines/sec)
- Average log line size
- Retention period
- Compression ratio
Estimation formula:
Storage = Hosts × LogRate × LineSize × Retention / Compression
Example:
10 hosts × 10 lines/sec × 200 bytes × 30 days / 10
= 10 × 10 × 200 × 2592000 / 10
= 5.18 GB
Compression: Loki typically achieves 5-10x compression
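Because the compression ratio is the least certain input, it can help to compute both ends of the 5-10x range. The sketch below does that in shell; estimate_loki_bytes is a hypothetical helper name, and the inputs are the ones from the example.

```shell
# Estimate compressed Loki storage in bytes.
# Arguments: hosts, log lines per second per host, average line size in bytes,
#            retention days, compression ratio.
estimate_loki_bytes() {
    hosts=$1; lines_per_sec=$2; line_bytes=$3; days=$4; ratio=$5
    seconds=$(( days * 86400 ))
    echo $(( hosts * lines_per_sec * line_bytes * seconds / ratio ))
}

# Show the estimate at both ends of the typical compression range.
for r in 5 10; do
    echo "compression ${r}x: $(estimate_loki_bytes 10 10 200 30 "$r") bytes"
done
```

At 10x this gives 5184000000 bytes (5.18 GB); at 5x the same workload needs about twice that.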
Retention Policies¶
Time-Based Retention¶
InfluxDB:
influxdb_retention: "30d" # Keep 30 days
influxdb_retention: "90d" # Keep 90 days
influxdb_retention: "365d" # Keep 1 year
influxdb_retention: "0" # Keep forever (not recommended)
Loki:
Retention Best Practices¶
- Short-term (7-30 days): Local storage, fast queries
- Medium-term (30-90 days): S3 storage, cost-effective
- Long-term (90+ days): S3 storage with longer retention
- Archive: Export to cold storage for compliance
Retention Strategy Example¶
# Local storage for recent data (fast access)
influxdb_local:
  retention: "7d"
  path: "/var/lib/influxdb2"
# S3 storage for historical data (cost-effective)
influxdb_s3:
  retention: "365d"
  bucket: "influx-historical"
Performance Optimization¶
Local Storage Performance¶
Disk type:
- SSD: Recommended for production
- NVMe: Best performance for high write rates
- HDD: Acceptable for low-volume or S3-backed deployments
Filesystem:
- ext4: Good general-purpose choice
- xfs: Better for large files
- btrfs: Supports snapshots (useful for backups)
Mount options:
S3 Performance¶
Optimize chunk sizes:
# Loki
loki_max_chunk_age: "2h" # Larger chunks = fewer S3 requests
loki_chunk_idle_period: "30m"
# InfluxDB
influxdb_cache_max_memory: "1g" # More caching = fewer S3 reads
Concurrent operations:
Storage Migration¶
Local to S3 Migration¶
InfluxDB:
- Deploy new InfluxDB with S3 backend
- Use influx backup to export data
- Use influx restore to import to new instance
- Update Telegraf clients to new endpoint
- Decommission old instance
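The backup and restore steps for InfluxDB might look like the sketch below. It is a dry run by default and only prints the commands. The host names and backup path are placeholders, not values from this deployment, and it assumes the influx CLI picks up the API token from the INFLUX_TOKEN environment variable, set for whichever instance is being addressed. Depending on the state of the target, restore may need extra flags such as --full; check influx restore --help before running it for real.

```shell
# Dry-run wrapper: print each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder endpoints and backup path; replace with real values.
OLD_HOST="http://old-influx.example.com:8086"
NEW_HOST="http://new-influx.example.com:8086"
BACKUP_DIR="/tmp/influx-backup"

# Export data from the old (local storage) instance.
# INFLUX_TOKEN must hold a token for the old instance at this point.
backup_old() {
    run influx backup "$BACKUP_DIR" --host "$OLD_HOST"
}

# Import the backup into the new (S3-backed) instance.
# INFLUX_TOKEN must hold a token for the new instance at this point.
restore_new() {
    run influx restore "$BACKUP_DIR" --host "$NEW_HOST"
}

backup_old
restore_new
```

Keeping the commands behind a dry-run wrapper makes it easy to review exactly what will be executed before pointing it at production data.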
Loki:
- Deploy new Loki with S3 backend
- Update Alloy clients to send to both old and new
- Wait for retention period on old instance
- Remove old Loki endpoint from Alloy configs
- Decommission old instance
Expanding Storage¶
Local storage:
# Add new disk
# Create filesystem
mkfs.ext4 /dev/sdc1
# Mount new disk
mount /dev/sdc1 /mnt/new_storage
# Stop services
systemctl stop influxdb loki
# Move data
rsync -av /var/lib/influxdb2/ /mnt/new_storage/influxdb/
rsync -av /var/lib/loki/ /mnt/new_storage/loki/
# Update mount points in systemd units
# Restart services
systemctl start influxdb loki
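Before restarting the services it is worth confirming that the copy is complete. One possible check, sketched below, compares the old and new trees with diff. The verify_copy name is made up for this example, and the demo runs on throwaway directories rather than the real data paths.

```shell
# Compare two directory trees by content; prints "match" or "MISMATCH".
# Note: diff checks file contents only, not ownership or permissions.
verify_copy() {
    src=$1; dst=$2
    if diff -rq "$src" "$dst" >/dev/null 2>&1; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}

# Demo on throwaway directories (stand-ins for the real data paths).
demo=$(mktemp -d)
mkdir "$demo/old" "$demo/new"
echo "shard" > "$demo/old/data.tsm"
cp "$demo/old/data.tsm" "$demo/new/data.tsm"
verify_copy "$demo/old" "$demo/new"
```

On the real system the call would be along the lines of verify_copy /var/lib/influxdb2 /mnt/new_storage/influxdb, run while the services are still stopped.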
Backup Strategies¶
Local Storage Backup¶
# InfluxDB backup
systemctl stop influxdb
tar -czf influxdb-backup-$(date +%Y%m%d).tar.gz /var/lib/influxdb2/
systemctl start influxdb
# Loki backup
systemctl stop loki
tar -czf loki-backup-$(date +%Y%m%d).tar.gz /var/lib/loki/
systemctl start loki
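A backup that cannot be read back is not much use, so the tar step can be paired with a quick verification. The sketch below wraps both in a hypothetical backup_dir helper and demonstrates it on a temporary directory; stopping and starting the service around it stays the same as in the commands above.

```shell
# Create a compressed archive of a directory and verify it can be read back.
# Arguments: source directory, destination archive path.
backup_dir() {
    src=$1; archive=$2
    # -C stores paths relative to the parent directory instead of absolute.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Listing the archive reads it end to end and fails if it is corrupt.
    tar -tzf "$archive" >/dev/null || return 1
    echo "verified: $archive"
}

# Demo on a throwaway directory rather than live data.
demo=$(mktemp -d)
mkdir "$demo/data"
echo "sample" > "$demo/data/file.txt"
backup_dir "$demo/data" "$demo/backup-$(date +%Y%m%d).tar.gz"
```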
S3 Storage Backup¶
When using S3 backend:
- Data already stored in object storage (durable)
- Enable S3 versioning for point-in-time recovery
- Use S3 replication for disaster recovery
- No application-level backup needed
Cost Optimization¶
Storage Tier Strategy¶
- Hot tier (0-7 days): Local SSD for fast access
- Warm tier (7-30 days): S3 standard storage
- Cold tier (30+ days): S3 infrequent access or Glacier
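The tier boundaries can be written down as a small lookup. This is only an illustration of the rule: tier_for_age is a hypothetical helper, and since the ranges above overlap at 7 and 30 days, the sketch assumes data moves to the next tier on exactly those days.

```shell
# Map the age of data in days to a storage tier name.
# Assumption: day 7 counts as warm and day 30 counts as cold.
tier_for_age() {
    age_days=$1
    if [ "$age_days" -lt 7 ]; then
        echo "hot"
    elif [ "$age_days" -lt 30 ]; then
        echo "warm"
    else
        echo "cold"
    fi
}

for age in 0 7 29 30 365; do
    echo "$age days: $(tier_for_age "$age")"
done
```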
Compression¶
Loki: Built-in compression (5-10x)
InfluxDB: Compresses data automatically using type-specific encodings (Snappy for string values)
Retention Tuning¶
Balance retention with cost:
# Short retention for high-cardinality metrics
metrics_detailed:
  retention: "7d"
# Long retention for aggregate/summary metrics
metrics_summary:
  retention: "365d"
Monitoring Storage¶
Disk Usage¶
# Check disk usage
df -h /var/lib/influxdb2
df -h /var/lib/loki
# Check data directory size (compare readings over time to get the growth rate)
du -sh /var/lib/influxdb2
du -sh /var/lib/loki
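A single du reading only gives the current size; a growth rate needs two readings and the time between them. The sketch below shows the arithmetic. The helper names are invented for this example and the sample numbers are made up, not measurements from a real host.

```shell
# Current size of a directory in KB (du -sk is POSIX).
dir_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Growth in KB per day from two size readings.
# Arguments: earlier size in KB, later size in KB, seconds between readings.
growth_kb_per_day() {
    kb_then=$1; kb_now=$2; elapsed_s=$3
    echo $(( (kb_now - kb_then) * 86400 / elapsed_s ))
}

# Whole days until the free space runs out at a given growth rate.
# Arguments: free space in KB, growth in KB per day (must be positive).
days_until_full() {
    echo $(( $1 / $2 ))
}

# Made-up readings: 12000 KB of growth over one hour.
rate=$(growth_kb_per_day 1000000 1012000 3600)
echo "growth: $rate KB/day"
echo "days until 28800000 KB free is used up: $(days_until_full 28800000 "$rate")"

# dir_kb works on any readable directory; a temp dir is used here as a demo.
demo=$(mktemp -d)
echo "demo dir size: $(dir_kb "$demo") KB"
```

In practice the two readings would come from dir_kb /var/lib/influxdb2 (or /var/lib/loki) taken by a scheduled job some hours apart.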
S3 Usage¶
Check bucket size via S3 API or web console.
Alerts¶
Set up alerts for:
- Disk space < 20% free
- Storage growth rate exceeding projections
- S3 API errors
- High storage latency
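The free-space alert can be approximated with a small script until a proper alerting rule is in place. This is a sketch, assuming POSIX df output; used_pct and low_space are hypothetical helper names, and the paths are the local data directories used elsewhere on this page.

```shell
# Percentage of space used on the filesystem that holds the given path.
used_pct() {
    # df -P prints stable POSIX columns; column 5 is capacity, e.g. "42%".
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds (exit 0) when free space is below the minimum percentage.
# Arguments: used percent, minimum free percent.
low_space() {
    used=$1; min_free=$2
    [ $(( 100 - used )) -lt "$min_free" ]
}

for path in /var/lib/influxdb2 /var/lib/loki; do
    # Skip paths that do not exist on this host.
    [ -d "$path" ] || continue
    if low_space "$(used_pct "$path")" 20; then
        echo "ALERT: $path has less than 20% free"
    fi
done
```

Run from cron or a systemd timer, the output could be mailed or forwarded to whatever notification channel is already in use.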
Reference Deployment¶
See the Reference Deployments chapter for a real-world S3 storage configuration:
- monitor11.example.com - Both InfluxDB and Loki with S3 backend