StatsD Integration

Collect metrics via the StatsD protocol with the LogFlux Agent

The LogFlux StatsD plugin provides a StatsD-compatible metrics receiver that collects application metrics and forwards them to LogFlux as structured log entries. This allows you to centralize both logs and metrics in a single platform while maintaining compatibility with existing StatsD client libraries.

Overview

The StatsD plugin provides:

  • StatsD Protocol Compatibility: Full support for StatsD metric types and wire protocol
  • UDP and TCP Support: Receive metrics over UDP (default) or TCP connections
  • Metric Aggregation: Built-in aggregation with configurable flush intervals
  • Multiple Metric Types: Support for counters, gauges, timers, histograms, and sets
  • Tag Support: Extended StatsD format with tags for richer metric metadata
  • Batch Processing: Efficient batching for high-volume metric streams
  • Real-time Forwarding: Immediate forwarding of metrics to LogFlux
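Because the plugin speaks the standard wire protocol, no special client library is required: any process that can write a UDP datagram can report metrics. A minimal sketch in Python (host and port are assumed to match the default configuration shown below):

```python
import socket

def send_metric(line: str, host: str = "localhost", port: int = 8125) -> None:
    """Send one StatsD line as a single UDP datagram (fire-and-forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))

send_metric("myapp.requests.total:1|c")
send_metric("myapp.active_connections:42|g")
send_metric("myapp.request.duration:156|ms")
```

UDP delivery is best-effort by design: the sender never blocks and never learns whether the datagram arrived, which is why StatsD clients add negligible overhead to the instrumented application.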

Installation

The StatsD plugin is included with the LogFlux Agent but disabled by default.

Prerequisites

  • LogFlux Agent installed (see Installation Guide)
  • Network access to StatsD port (default: 8125)

Enable the Plugin

# Enable and start the StatsD plugin
sudo systemctl enable --now logflux-statsd

# Check status
sudo systemctl status logflux-statsd

Configuration

Basic Configuration

Create or edit the StatsD plugin configuration:

sudo nano /etc/logflux-agent/plugins/statsd.yaml

Basic configuration:

# StatsD Plugin Configuration
statsd:
  # Bind address and port
  bind_address: "0.0.0.0:8125"
  
  # Protocol (udp, tcp, or both)
  protocol: "udp"
  
  # Flush interval for aggregated metrics
  flush_interval: 10s
  
  # Maximum UDP packet size
  max_udp_packet_size: 1432

# Metric processing
aggregation:
  # Enable metric aggregation
  enabled: true
  
  # Percentiles for timer metrics
  percentiles: [50, 90, 95, 99]
  
  # Counter reset behavior
  delete_counters: true
  delete_gauges: false
  delete_timers: true
  delete_sets: true

# Log formatting
output:
  # Add metric metadata
  include_metadata: true
  
  # Metric name prefix
  prefix: "metrics"

Advanced Configuration

# Advanced StatsD Plugin Configuration
statsd:
  # Network settings
  bind_address: "0.0.0.0:8125"
  protocol: "both"  # Accept both UDP and TCP
  
  # Buffer settings
  read_buffer_size: 65536
  max_udp_packet_size: 1432
  tcp_keep_alive: true
  tcp_timeout: 30s
  
  # Performance settings
  worker_threads: 4
  queue_size: 10000
  
  # Flush settings
  flush_interval: 10s
  flush_jitter: 1s  # Random jitter to prevent thundering herd

# Metric aggregation
aggregation:
  enabled: true
  
  # Timer/histogram settings
  percentiles: [50, 90, 95, 99, 99.9]
  histogram_buckets: [0.1, 0.5, 1, 2.5, 5, 10]
  
  # Cleanup behavior
  delete_counters: true
  delete_gauges: false
  delete_timers: true
  delete_sets: true
  delete_histograms: true

# Metric filtering
filters:
  # Include only specific metric patterns
  include_patterns:
    - "^app\\..*"
    - "^service\\..*"
  
  # Exclude metric patterns
  exclude_patterns:
    - "^debug\\..*"
    - "^test\\..*"
  
  # Tag filtering
  include_tags:
    - "environment=production"
    - "service=web"
  
  exclude_tags:
    - "debug=true"

# Rate limiting
rate_limiting:
  # Enable rate limiting
  enabled: true
  
  # Maximum metrics per second
  max_metrics_per_second: 10000
  
  # Burst capacity
  burst_capacity: 50000

# Output formatting
output:
  # Metric metadata
  include_metadata: true
  include_tags: true
  include_timestamp: true
  
  # Field naming
  metric_name_field: "metric_name"
  metric_value_field: "value"
  metric_type_field: "type"
  
  # Prefixes
  prefix: "metrics"
  tag_prefix: "tag."

# Multi-line support (for structured metrics)
multiline:
  # Support for JSON metrics
  patterns:
    - pattern: '^{.*}$'
      codec: "json"

Usage Examples

Basic Metrics Collection

# Start StatsD plugin with default configuration
sudo systemctl enable --now logflux-statsd

# Test with netcat
echo "test.counter:1|c" | nc -u localhost 8125
echo "test.gauge:42|g" | nc -u localhost 8125
echo "test.timer:320|ms" | nc -u localhost 8125

Application Integration

Node.js Example:

const StatsD = require('node-statsd');
const client = new StatsD({
  host: 'localhost',
  port: 8125,
  prefix: 'myapp.'
});

// Counter
client.increment('requests.total');
client.increment('requests.by_status', 1, 1, ['status:200']);  // value, sample rate, tags

// Gauge
client.gauge('active_connections', 42);

// Timer
client.timing('request.duration', 156);

// Histogram
client.histogram('response.size', 1024, ['endpoint:/api/users']);

Python Example:

# The plain "statsd" package does not support tags or histograms;
# Datadog's client does, and speaks the same wire protocol.
# pip install datadog
from datadog import DogStatsd

# Create client
client = DogStatsd(host='localhost', port=8125, namespace='myapp')

# Counter
client.increment('requests.total')
client.increment('requests.by_status', tags=['status:200'])

# Gauge
client.gauge('active_connections', 42)

# Timer
with client.timed('request.duration'):
    # Your code here
    pass

# Histogram
client.histogram('response.size', 1024, tags=['endpoint:/api/users'])

Go Example:

package main

import (
    "log"
    "time"

    "github.com/DataDog/datadog-go/statsd"
)

func main() {
    client, err := statsd.New("localhost:8125",
        statsd.WithNamespace("myapp."))
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // Counter
    client.Incr("requests.total", []string{"status:200"}, 1)

    // Gauge
    client.Gauge("active_connections", 42, nil, 1)

    // Timer
    start := time.Now()
    // Your code here
    client.Timing("request.duration", time.Since(start), nil, 1)

    // Histogram
    client.Histogram("response.size", 1024, []string{"endpoint:/api/users"}, 1)
}

Metric Types and Formats

StatsD Wire Protocol

The plugin supports the standard StatsD wire protocol:

<metric_name>:<value>|<type>[|@<sample_rate>][#<tag1>:<value1>,<tag2>:<value2>]
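For illustration, the format above can be parsed with a few string splits. This sketch is not the plugin's internal representation; the class and field names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    name: str
    value: str
    type: str
    sample_rate: float = 1.0
    tags: dict = field(default_factory=dict)

def parse_statsd_line(line: str) -> Metric:
    """Parse '<name>:<value>|<type>[|@<rate>][#k:v,...]' into its parts."""
    # Tags, if present, follow a '#' at the end of the line
    tags = {}
    if "#" in line:
        line, _, tag_part = line.partition("#")
        for pair in tag_part.split(","):
            k, _, v = pair.partition(":")
            tags[k] = v
    name, _, rest = line.partition(":")
    fields = rest.split("|")
    value, mtype = fields[0], fields[1]
    # Optional sample rate arrives as '@0.1' after the type
    rate = 1.0
    if len(fields) > 2 and fields[2].startswith("@"):
        rate = float(fields[2][1:])
    return Metric(name, value, mtype, rate, tags)
```

For example, `parse_statsd_line("requests.total:1|c|@0.1#service:web")` yields a counter named `requests.total` with sample rate 0.1 and one tag.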

Supported Metric Types

Counters (|c):

# Basic counter
requests.total:1|c

# Counter with sample rate
requests.total:1|c|@0.1

# Counter with tags
requests.total:1|c#environment:production,service:web

Gauges (|g):

# Absolute gauge
memory.usage:1024|g

# Relative gauge (delta)
memory.usage:+100|g
memory.usage:-50|g

Timers/Histograms (|ms or |h):

# Timer in milliseconds
request.duration:156|ms

# Histogram
response.size:1024|h#endpoint:/api/users

Sets (|s):

# Unique value counting
unique.users:user123|s
unique.users:user456|s

Extended Format with Tags

LogFlux StatsD supports tagged metrics:

# Tagged counter
http.requests:1|c#method:GET,status:200,endpoint:/api/users

# Tagged gauge with multiple dimensions
system.cpu.usage:75.5|g#host:web01,core:0,type:user

Metric Aggregation

Aggregation Behavior

The StatsD plugin aggregates metrics over flush intervals:

  • Counters: Summed over the interval
  • Gauges: Last value received
  • Timers: Count, sum, mean, percentiles, min, max
  • Histograms: Distribution buckets and percentiles
  • Sets: Unique count

Timer Aggregation Example

Input over 10-second interval:

request.duration:100|ms
request.duration:200|ms
request.duration:150|ms

Output to LogFlux:

{
  "timestamp": "2024-01-20T14:30:50.000Z",
  "level": "info",
  "message": "StatsD metric",
  "metric_name": "request.duration",
  "type": "timer",
  "count": 3,
  "sum": 450,
  "mean": 150,
  "min": 100,
  "max": 200,
  "percentile_50": 150,
  "percentile_90": 200,
  "percentile_95": 200,
  "percentile_99": 200
}
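The summary fields above follow from straightforward arithmetic over the buffered samples. A sketch using the nearest-rank percentile convention; the plugin's exact interpolation method may differ:

```python
import math

def timer_stats(samples, percentiles=(50, 90, 95, 99)):
    """Aggregate raw timer samples into flushed summary fields."""
    s = sorted(samples)
    n = len(s)
    stats = {
        "count": n,
        "sum": sum(s),
        "mean": sum(s) / n,
        "min": s[0],
        "max": s[-1],
    }
    for p in percentiles:
        # Nearest-rank percentile: smallest sample covering p% of values
        idx = max(math.ceil(p / 100 * n) - 1, 0)
        stats[f"percentile_{p}"] = s[idx]
    return stats

# For the three samples above: count=3, sum=450, mean=150.0,
# p50=150, p90/p95/p99=200 — matching the JSON output shown
timer_stats([100, 200, 150])
```

With only three samples the upper percentiles all collapse onto the maximum, which is why p90 through p99 are identical in the example output.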

Performance Optimization

High-Volume Configuration

For high-volume metrics environments:

# High-performance configuration
statsd:
  bind_address: "0.0.0.0:8125"
  protocol: "udp"
  read_buffer_size: 131072
  worker_threads: 8
  queue_size: 50000
  flush_interval: 5s

aggregation:
  percentiles: [95, 99]  # Reduce percentile calculations

rate_limiting:
  enabled: true
  max_metrics_per_second: 50000
  burst_capacity: 100000

Memory Management

# Memory-optimized settings
aggregation:
  delete_counters: true
  delete_timers: true
  delete_sets: true

filters:
  # Reduce metric cardinality
  exclude_patterns:
    - "^debug\\..*"
    - ".*\\.temp\\..*"

Monitoring and Alerting

Plugin Health Monitoring

#!/bin/bash
# /usr/local/bin/check-statsd-plugin.sh

if ! systemctl is-active --quiet logflux-statsd; then
    echo "CRITICAL: LogFlux StatsD plugin is not running"
    exit 2
fi

# Check if plugin is receiving metrics
if ! ss -ulnp | grep -q ":8125.*logflux-statsd"; then
    echo "CRITICAL: StatsD port 8125 is not listening"
    exit 2
fi

# Check recent metric collection
if ! journalctl -u logflux-statsd --since="5 minutes ago" | grep -q "metrics processed"; then
    echo "WARNING: No metrics processed in last 5 minutes"
    exit 1
fi

echo "OK: LogFlux StatsD plugin is healthy"
exit 0

Network Monitoring

# Monitor UDP packet loss
netstat -su | grep -i "packet receive errors"

# Monitor port usage
ss -ulnp | grep :8125

# Check firewall rules
sudo iptables -L | grep 8125

Troubleshooting

Common Issues

Port Already in Use:

# Check what's using port 8125
sudo lsof -i :8125
sudo ss -ulnp | grep 8125

# Kill conflicting process
sudo systemctl stop existing-statsd-service

UDP Packet Loss:

# Increase kernel buffer sizes (plain '>>' won't work under sudo; use tee)
echo 'net.core.rmem_max = 134217728' | sudo tee -a /etc/sysctl.conf
echo 'net.core.rmem_default = 134217728' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

High CPU Usage:

# Reduce processing load
statsd:
  worker_threads: 2
  flush_interval: 30s

aggregation:
  percentiles: [95]  # Reduce percentile calculations

Memory Issues:

# Enable aggressive cleanup
aggregation:
  delete_counters: true
  delete_timers: true
  delete_sets: true
  delete_histograms: true

filters:
  # Reduce metric cardinality
  include_patterns:
    - "^(app|service)\\..*"

Debugging

# Enable verbose logging
sudo journalctl -u logflux-statsd -f

# Test metric reception
echo "test.debug:1|c" | nc -u localhost 8125

# Check configuration
sudo systemctl status logflux-statsd
sudo cat /etc/logflux-agent/plugins/statsd.yaml

Security Considerations

Network Security

# Bind to specific interface only
# statsd.yaml
statsd:
  bind_address: "192.168.1.100:8125"  # Internal IP only

Firewall Configuration

# Allow StatsD from internal networks only
sudo iptables -A INPUT -p udp --dport 8125 -s 10.0.0.0/8 -j ACCEPT
sudo iptables -A INPUT -p udp --dport 8125 -s 192.168.0.0/16 -j ACCEPT
sudo iptables -A INPUT -p udp --dport 8125 -j DROP

Rate Limiting

# Prevent abuse
rate_limiting:
  enabled: true
  max_metrics_per_second: 1000
  burst_capacity: 5000

filters:
  # Block potentially malicious patterns
  exclude_patterns:
    - ".*\\.\\..*"  # Directory traversal attempts
    - "^[^a-zA-Z0-9._-].*"  # Invalid characters

Integration Examples

Prometheus Integration

Convert StatsD metrics for Prometheus scraping:

# prometheus-compatible.yaml
output:
  include_metadata: true
  metric_name_field: "__name__"
  
aggregation:
  histogram_buckets: [0.1, 0.5, 1, 2.5, 5, 10, 25, 50, 100]

Grafana Dashboard

Create Grafana dashboard using LogFlux data source:

{
  "dashboard": {
    "title": "StatsD Metrics",
    "panels": [
      {
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "sum(rate(requests_total[5m])) by (service)"
          }
        ]
      }
    ]
  }
}

Docker Compose

Deploy with Docker Compose:

# docker-compose.yml
version: '3.8'
services:
  app:
    image: myapp:latest
    environment:
      - STATSD_HOST=logflux-agent
      - STATSD_PORT=8125
    
  logflux-agent:
    image: logflux/agent:latest
    ports:
      - "8125:8125/udp"
    volumes:
      - ./statsd.yaml:/etc/logflux-agent/plugins/statsd.yaml
    environment:
      - LOGFLUX_API_KEY=${LOGFLUX_API_KEY}

Best Practices

Metric Naming

  1. Use Hierarchical Names: service.component.metric_name
  2. Include Units: request.duration_ms, memory.usage_bytes
  3. Consistent Naming: Use consistent separators and conventions
  4. Avoid High Cardinality: Limit unique tag combinations
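Rules 1 through 3 can be checked client-side before a metric is ever sent. The validator below is illustrative only; the plugin itself does not enforce these conventions:

```python
import re

# Hierarchical, lowercase, dot-separated segments of [a-z0-9_]
METRIC_NAME = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+$")

def valid_metric_name(name: str) -> bool:
    """True for names like 'service.component.metric_name'."""
    return bool(METRIC_NAME.match(name))

assert valid_metric_name("app.request.duration_ms")
assert valid_metric_name("memory.usage_bytes")
assert not valid_metric_name("App Request Duration")  # spaces, uppercase
assert not valid_metric_name("debug")                 # no hierarchy
```

Catching malformed names at the client keeps them out of the `exclude_patterns` filter entirely, which is cheaper than filtering server-side.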

Performance

  1. Use UDP for High Volume: UDP has lower overhead than TCP
  2. Batch Client-Side: Use client libraries that batch metrics
  3. Monitor Buffer Sizes: Increase buffers for high-volume scenarios
  4. Filter Aggressively: Use include/exclude patterns to reduce load
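Client-side batching (point 2) works because the StatsD protocol allows several newline-separated metrics per datagram. A sketch that packs lines up to a size limit, matching the max_udp_packet_size setting from the configuration above (assumes ASCII metric lines):

```python
def pack_datagrams(lines, max_bytes=1432):
    """Group StatsD lines into newline-joined datagrams of <= max_bytes."""
    packets, current, size = [], [], 0
    for line in lines:
        # Each line after the first costs one extra byte for the '\n'
        extra = len(line) + (1 if current else 0)
        if current and size + extra > max_bytes:
            packets.append("\n".join(current))
            current, size = [], 0
            extra = len(line)
        current.append(line)
        size += extra
    if current:
        packets.append("\n".join(current))
    return packets
```

Keeping each datagram under the MTU-derived limit (1432 bytes on a typical 1500-byte Ethernet path) avoids IP fragmentation, which is the main cause of silent UDP metric loss.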

Monitoring

  1. Health Checks: Monitor plugin health and port availability
  2. Packet Loss: Monitor UDP packet loss on high-volume systems
  3. Queue Depth: Monitor internal queue sizes and processing rates
  4. Error Rates: Track metric parsing and forwarding errors

Disclaimer

StatsD and the StatsD logo are trademarks of their respective owners. LogFlux is not affiliated with, endorsed by, or sponsored by the StatsD project or its maintainers. The StatsD logo is used solely for identification purposes to indicate compatibility with the StatsD protocol.

Next Steps