Apache HTTP Server

Monitor Apache web server logs with custom parsing and real-time processing using the File Stream plugin

Monitor and analyze Apache HTTP Server logs in real time using LogFlux Agent’s File Stream plugin. This configuration-based approach covers log parsing, custom log formats, and analytics for Apache web servers.

Overview

The Apache HTTP Server integration uses LogFlux Agent’s File Stream plugin to provide:

  • Real-time monitoring of access and error logs
  • Custom log format parsing for Combined, Common, and custom formats
  • Performance analytics with response time and bandwidth tracking
  • Security monitoring with attack pattern detection
  • Virtual host support for multi-site deployments
  • Automatic log rotation handling

Installation

The File Stream plugin is included with LogFlux Agent. Enable it for Apache log monitoring:

# Enable File Stream plugin
sudo systemctl enable --now logflux-filestream

# Verify plugin status
sudo systemctl status logflux-filestream

Basic Configuration

Configure the File Stream plugin to monitor Apache logs by creating /etc/logflux-agent/plugins/filestream-apache.toml:

[filestream.apache_access]
paths = ["/var/log/apache2/access.log"]
format = "apache_combined"
tags = ["apache", "access", "web"]
# TOML 1.0 inline tables must stay on a single line
fields = { service = "apache", log_type = "access" }

[filestream.apache_error]
paths = ["/var/log/apache2/error.log"]
format = "apache_error"
tags = ["apache", "error", "web"]
fields = { service = "apache", log_type = "error" }

Apache Log Formats

Combined Log Format (Default)

LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined

Configuration:

[filestream.apache_combined]
paths = ["/var/log/apache2/access.log"]
format = "apache_combined"
parse_timestamp = true
timestamp_format = "[02/Jan/2006:15:04:05 -0700]"

Common Log Format

LogFormat "%h %l %u %t \"%r\" %>s %O" common

Configuration:

[filestream.apache_common]
paths = ["/var/log/apache2/access.log"]
format = "apache_common"  
parse_timestamp = true
timestamp_format = "[02/Jan/2006:15:04:05 -0700]"

Custom Log Format with Response Time

LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %D" custom

Configuration:

[filestream.apache_custom]
paths = ["/var/log/apache2/access.log"]
format = "regex"
regex = '^(?P<remote_addr>\S+) (?P<remote_logname>\S+) (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d+) (?P<body_bytes_sent>\S+) "(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)" (?P<request_time>\d+)$'
parse_timestamp = true
timestamp_field = "time_local"
timestamp_format = "02/Jan/2006:15:04:05 -0700"
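
Before deploying a custom pattern, it can help to check it against a sample line. The sketch below uses Python's `re` module, which accepts the same `(?P<name>...)` group syntax as the pattern above. The sample log line and the `parse_line` helper are invented for illustration, and LogFlux's own regex engine may differ in edge cases.

```python
import re
from datetime import datetime

# Same pattern as the apache_custom stream above.
PATTERN = re.compile(
    r'^(?P<remote_addr>\S+) (?P<remote_logname>\S+) (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d+) '
    r'(?P<body_bytes_sent>\S+) "(?P<http_referer>[^"]*)" '
    r'"(?P<http_user_agent>[^"]*)" (?P<request_time>\d+)$'
)

# Hypothetical sample line in the "custom" LogFormat (trailing %D, in microseconds).
SAMPLE = (
    '203.0.113.7 - alice [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"https://example.com/" "Mozilla/5.0" 15234'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Python equivalent of the Go-style layout "02/Jan/2006:15:04:05 -0700".
    fields["timestamp"] = datetime.strptime(
        fields["time_local"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return fields


if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

A line that returns `None` here would most likely be dropped or flagged as a parse error by the agent, so it is worth testing a few real lines from your own access log, including ones with `-` in the referer or user-agent fields.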

Advanced Configuration

Multi-Site Virtual Host Monitoring

[filestream.apache_site1]
paths = ["/var/log/apache2/site1_access.log"]
format = "apache_combined"
tags = ["apache", "access", "site1"]
fields = { service = "apache", site = "example.com", log_type = "access" }

[filestream.apache_site2]
paths = ["/var/log/apache2/site2_access.log"]
format = "apache_combined"
tags = ["apache", "access", "site2"]
fields = { service = "apache", site = "api.example.com", log_type = "access" }

SSL/TLS Log Monitoring

[filestream.apache_ssl]
paths = ["/var/log/apache2/ssl_access.log"]
format = "regex"
regex = '^(?P<remote_addr>\S+) (?P<remote_logname>\S+) (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d+) (?P<body_bytes_sent>\S+) "(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)" (?P<ssl_protocol>\S+) (?P<ssl_cipher>\S+)$'
tags = ["apache", "ssl", "security"]
fields = { service = "apache", log_type = "ssl_access" }

Performance Monitoring with Detailed Metrics

[filestream.apache_performance]
paths = ["/var/log/apache2/access.log"]
format = "regex"
regex = '^(?P<remote_addr>\S+) (?P<remote_logname>\S+) (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] "(?P<request_method>\S+) (?P<request_uri>\S+) (?P<request_protocol>\S+)" (?P<status>\d+) (?P<body_bytes_sent>\S+) "(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)" (?P<request_time>\d+) (?P<bytes_received>\d+)$'
tags = ["apache", "performance"]
fields = { service = "apache", log_type = "performance" }

# Add calculated fields
# request_time comes from %D (microseconds); dividing by 1000 gives milliseconds.
# bandwidth_mbps is bytes sent plus bytes received for the request, expressed in MiB
# (a per-request volume, not a per-second rate).
[filestream.apache_performance.processors.add_fields.fields]
response_time_ms = "{{ .request_time | div 1000 }}"
bandwidth_mbps = "{{ add .body_bytes_sent .bytes_received | div 1048576 }}"
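
To make the unit conversions concrete: Apache's `%D` reports request duration in microseconds, while `%O` and `%I` report bytes sent and received. The sketch below reproduces the two calculated fields in plain Python. The function name, the `transferred_mib` key, and the sample values are illustrative; how LogFlux's `div` and `add` template helpers order their arguments is assumed here, not confirmed.

```python
def derive_metrics(request_time_us, body_bytes_sent, bytes_received):
    """Compute per-request metrics from raw access-log values."""
    # %D is in microseconds, so divide by 1000 to get milliseconds.
    response_time_ms = request_time_us / 1000
    # Bytes in both directions, converted to MiB (1 MiB = 1048576 bytes).
    transferred_mib = (body_bytes_sent + bytes_received) / 1048576
    return {
        "response_time_ms": response_time_ms,
        "transferred_mib": transferred_mib,
    }


if __name__ == "__main__":
    # A 250 ms request that sent 1 MiB and received 0.5 MiB.
    print(derive_metrics(250000, 1048576, 524288))
```

Comparing a few hand-computed values like these against what the agent emits is a quick way to confirm that the template expressions divide in the intended direction.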

Error Log Configuration

Standard Error Logs

[filestream.apache_error]
paths = ["/var/log/apache2/error.log"]
format = "regex"
# The tid part is optional because the prefork MPM logs "[pid N]" without a thread ID.
regex = '^\[(?P<timestamp>[^\]]+)\] \[(?P<module>[^:]+):(?P<level>[^\]]+)\] \[pid (?P<pid>\d+)(?::tid (?P<tid>\d+))?\] (?:\[client (?P<client_ip>[^\]]+)\] )?(?P<message>.*)$'
tags = ["apache", "error"]
fields = { service = "apache", log_type = "error" }
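
As a rough check of the error-log pattern, the sketch below runs a variant of it in Python against two invented Apache 2.4 error lines. This variant treats the `tid` part as optional so that prefork-style lines also match. It also splits the `client_ip` capture, which includes the client port, into separate host and port values; the `client_host` and `client_port` names are hypothetical and not LogFlux fields.

```python
import re

ERROR_RE = re.compile(
    r'^\[(?P<timestamp>[^\]]+)\] \[(?P<module>[^:]+):(?P<level>[^\]]+)\] '
    r'\[pid (?P<pid>\d+)(?::tid (?P<tid>\d+))?\] '
    r'(?:\[client (?P<client_ip>[^\]]+)\] )?(?P<message>.*)$'
)

# Hypothetical sample lines: one from a threaded MPM, one from prefork.
THREADED = (
    '[Tue Oct 10 13:55:36.123456 2023] [authz_core:error] '
    '[pid 1234:tid 140234] [client 203.0.113.7:51514] '
    'AH01630: client denied by server configuration: /var/www/html/admin'
)
PREFORK = (
    '[Tue Oct 10 13:55:36.123456 2023] [mpm_prefork:notice] '
    '[pid 1234] AH00163: resuming normal operations'
)


def parse_error_line(line):
    """Return named fields from an error-log line, or None if it does not match."""
    match = ERROR_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["client_ip"]:
        # The capture looks like "host:port"; split on the last colon.
        host, _, port = fields["client_ip"].rpartition(":")
        fields["client_host"], fields["client_port"] = host, port
    return fields


if __name__ == "__main__":
    print(parse_error_line(THREADED))
    print(parse_error_line(PREFORK))
```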

Security Error Monitoring

[filestream.apache_security]
paths = ["/var/log/apache2/error.log"]
format = "regex"
# Only lines with a [client ...] field match; the tid part is optional for prefork MPM logs.
regex = '^\[(?P<timestamp>[^\]]+)\] \[(?P<module>[^:]+):(?P<level>[^\]]+)\] \[pid (?P<pid>\d+)(?::tid (?P<tid>\d+))?\] \[client (?P<client_ip>[^\]]+)\] (?P<message>.*)$'
tags = ["apache", "security", "error"]
fields = { service = "apache", log_type = "security_error" }

# Filter for security-related errors
[filestream.apache_security.processors.grep]
patterns = [
  "ModSecurity",
  "attack",
  "blocked",
  "denied",
  "forbidden",
  "unauthorized"
]
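
The effect of the pattern list can be approximated as follows. This is a minimal sketch that assumes case-insensitive substring matching; the actual `grep` processor may be case-sensitive or treat the patterns as regular expressions, so check the plugin's behavior before relying on it.

```python
SECURITY_PATTERNS = [
    "ModSecurity",
    "attack",
    "blocked",
    "denied",
    "forbidden",
    "unauthorized",
]


def is_security_event(message, patterns=SECURITY_PATTERNS):
    """Return True if any pattern occurs in the message, ignoring case."""
    lowered = message.lower()
    return any(pattern.lower() in lowered for pattern in patterns)


if __name__ == "__main__":
    print(is_security_event("AH01630: client denied by server configuration"))
    print(is_security_event("AH00163: resuming normal operations"))
```

Broad words such as "denied" also match routine access-control messages, so the list is best tuned against a sample of your own error log.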

Usage Examples

Monitor Apache Access Logs

# Stream Apache access logs
logflux-cli stream --filter 'tags:apache AND tags:access'

# Monitor specific virtual host
logflux-cli stream --filter 'service:apache AND site:example.com'

# Track high response times (request_time comes from %D, in microseconds; 5000000 = 5 seconds)
logflux-cli stream --filter 'service:apache AND request_time:>5000000'

Security Monitoring

# Monitor failed authentication attempts
logflux-cli stream --filter 'service:apache AND status:401'

# Track potential attacks
logflux-cli stream --filter 'service:apache AND (status:403 OR status:404)'

# Monitor SSL/TLS connections
logflux-cli stream --filter 'tags:ssl AND service:apache'

Performance Analysis

# Monitor slow requests (>1 second)
logflux-cli stream --filter 'service:apache AND request_time:>1000000'

# Track bandwidth usage
logflux-cli stream --filter 'service:apache AND body_bytes_sent:>1048576'

# Monitor error rates
logflux-cli stream --filter 'service:apache AND status:>=500'

Apache Configuration

Enable Detailed Logging

Add to your Apache configuration (/etc/apache2/apache2.conf or virtual host):

# Enhanced combined format with response time (%D, microseconds) and bytes received (%I)
# %O and %I require mod_logio
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %D %I" enhanced
CustomLog /var/log/apache2/access.log enhanced

# Error log with detailed information  
LogLevel info
ErrorLog /var/log/apache2/error.log

# SSL access log (for HTTPS sites)
CustomLog /var/log/apache2/ssl_access.log "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %{SSL_PROTOCOL}x %{SSL_CIPHER}x"

Log Rotation Configuration

Create or edit /etc/logrotate.d/apache2 (Debian and Ubuntu packages already install this file):

/var/log/apache2/*.log {
    daily
    missingok
    rotate 52
    compress
    delaycompress
    notifempty
    create 0644 www-data adm
    # Run postrotate once per rotation rather than once per matched log file
    sharedscripts
    postrotate
        /bin/systemctl reload apache2.service > /dev/null 2>&1 || true
        /bin/systemctl reload logflux-filestream.service > /dev/null 2>&1 || true
    endpostrotate
}
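
For context on why rotation matters to a log tailer: after logrotate renames `access.log` and creates a fresh file, a reader holding the old file must notice the switch. The sketch below shows one common way to detect this, by comparing inode numbers and file sizes on a POSIX system. It illustrates the general technique only and is not a description of how the File Stream plugin is implemented; `was_rotated` and `demo` are hypothetical names.

```python
import os
import tempfile


def was_rotated(path, last_inode, last_offset):
    """Return True if the file at `path` was replaced or truncated since the last read."""
    st = os.stat(path)
    return st.st_ino != last_inode or st.st_size < last_offset


def demo():
    """Simulate a rename-and-create rotation in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "access.log")
        with open(log, "w") as handle:
            handle.write("line 1\n")
        st = os.stat(log)
        inode, offset = st.st_ino, st.st_size

        before = was_rotated(log, inode, offset)  # file is unchanged

        os.rename(log, log + ".1")                # logrotate moves the old file aside
        with open(log, "w"):                      # the `create` directive makes a new one
            pass
        after = was_rotated(log, inode, offset)   # new inode, smaller size
        return before, after


if __name__ == "__main__":
    print(demo())
```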

Monitoring and Alerting

Key Metrics to Monitor

# High error rate alert
[alerts.apache_error_rate]
query = "service:apache AND status:>=400"
threshold = 100
window = "5m"
message = "High error rate detected in Apache"

# Response time alert
[alerts.apache_slow_response]
query = "service:apache AND request_time:>10000000"
threshold = 10
window = "1m"  
message = "Slow Apache responses detected"

# Security alert
[alerts.apache_security]
query = "service:apache AND (tags:security OR status:403)"
threshold = 5
window = "1m"
message = "Security events detected in Apache"
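
One plausible reading of `threshold` and `window` is "fire when at least `threshold` matching events arrive within the last `window`". The sketch below models that with a sliding window over event timestamps. The exact semantics of LogFlux alerts (for example, whether the comparison is `>=` or `>`, or whether windows slide or tumble) are assumptions here, and `WindowAlert` is a hypothetical class.

```python
from collections import deque


class WindowAlert:
    """Fire when at least `threshold` events fall within the last `window_seconds`."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events = deque()

    def record(self, timestamp):
        """Record a matching event; return True if the alert condition holds."""
        self._events.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have aged out of the window.
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) >= self.threshold


if __name__ == "__main__":
    # Mirrors apache_security above: 5 events within 1 minute.
    alert = WindowAlert(threshold=5, window_seconds=60)
    for second in (0, 5, 10, 15, 20):
        print(second, alert.record(second))
```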

Performance Dashboards

Monitor these key Apache metrics:

  • Request rate (requests per second)
  • Response times (average, 95th percentile)
  • Error rates (4xx, 5xx responses)
  • Bandwidth usage (bytes transferred)
  • Active connections (concurrent users; reported by mod_status rather than the access log)
  • Top pages (most requested URLs)
  • User agents (browser/bot analysis)
  • Geographic distribution (client IP analysis)
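
Several of these metrics can be derived directly from parsed access-log records. The sketch below computes average and 95th-percentile response time, 4xx and 5xx rates, and bytes transferred from a list of records. The record layout (integer `status`, `request_time` in microseconds, `body_bytes_sent`) mirrors the field names used earlier, but the `summarize` helper and the nearest-rank percentile method are illustrative choices, not LogFlux features.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records):
    """Summarize a non-empty list of parsed access-log records."""
    total = len(records)
    times_ms = [r["request_time"] / 1000 for r in records]  # microseconds -> ms
    return {
        "requests": total,
        "avg_ms": sum(times_ms) / total,
        "p95_ms": percentile(times_ms, 95),
        "rate_4xx": sum(400 <= r["status"] < 500 for r in records) / total,
        "rate_5xx": sum(r["status"] >= 500 for r in records) / total,
        "bytes_sent": sum(r["body_bytes_sent"] for r in records),
    }


if __name__ == "__main__":
    sample = [
        {"status": 200, "request_time": 1000, "body_bytes_sent": 100},
        {"status": 200, "request_time": 3000, "body_bytes_sent": 200},
        {"status": 404, "request_time": 2000, "body_bytes_sent": 50},
        {"status": 500, "request_time": 10000, "body_bytes_sent": 10},
    ]
    print(summarize(sample))
```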

Troubleshooting

Common Issues

Log files not being read:

# Check file permissions
sudo ls -la /var/log/apache2/

# Verify the LogFlux user can read Apache logs (print one line instead of the whole file)
sudo -u logflux-agent head -n 1 /var/log/apache2/access.log

# Check plugin status
sudo systemctl status logflux-filestream

Parsing errors:

# Test regex pattern
echo "sample log line" | grep -P "your_regex_pattern"

# Check LogFlux Agent logs
sudo journalctl -u logflux-filestream -f

# Validate configuration
logflux-agent -config-test

Missing log entries:

# Verify Apache is logging
sudo tail -f /var/log/apache2/access.log

# Check log rotation
sudo logrotate -d /etc/logrotate.d/apache2

# Restart services
sudo systemctl restart apache2
sudo systemctl restart logflux-filestream  

Best Practices

Security

  • Sanitize sensitive data in logs (credit cards, passwords)
  • Monitor authentication failures and brute force attempts
  • Track admin access to sensitive areas
  • Implement log integrity checks with checksums
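
As an illustration of the first point, the sketch below redacts secret-looking query parameters and 16-digit card-like numbers from a log line before it is stored. The parameter names, the card pattern, and the replacement markers are all assumptions chosen for the example; a production filter needs patterns matched to your own application and should be tested for false positives.

```python
import re

# Hypothetical parameter names; extend to match your application.
SECRET_PARAM_RE = re.compile(r'\b(password|passwd|token)=[^&\s"]+', re.IGNORECASE)
# Heuristic: 16 digits, optionally grouped by hyphens.
CARD_RE = re.compile(r'\b\d{4}-?\d{4}-?\d{4}-?\d{4}\b')


def sanitize(line):
    """Redact secret query parameters and card-like numbers from a log line."""
    line = SECRET_PARAM_RE.sub(lambda m: m.group(1) + "=[REDACTED]", line)
    return CARD_RE.sub("[REDACTED-CARD]", line)


if __name__ == "__main__":
    print(sanitize('"GET /login?user=bob&password=hunter2 HTTP/1.1" 200 512'))
    print(sanitize('"GET /pay?card=4111-1111-1111-1111 HTTP/1.1" 200 128'))
```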

Performance

  • Use log sampling for high-traffic sites to reduce overhead
  • Implement log buffering for better I/O performance
  • Monitor disk space used by log files
  • Use SSD storage for log files when possible

Maintenance

  • Rotate logs regularly to prevent disk space issues
  • Archive old logs for compliance requirements
  • Monitor LogFlux Agent resource usage
  • Test configuration changes in staging environment

Log Format Optimization

# Production optimized format
LogFormat "%h %t \"%r\" %>s %O %D" minimal

# Development detailed format  
LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %D %I %{Host}i" detailed

Integration with Apache Modules

ModSecurity Integration

[filestream.apache_modsecurity]
paths = ["/var/log/apache2/modsec_audit.log"]
format = "json"
tags = ["apache", "modsecurity", "security"]
fields = { service = "apache", log_type = "modsecurity" }

Apache Status Monitoring

# Enable mod_status (Debian/Ubuntu layout, matching the paths used in this guide)
sudo a2enmod status
sudo systemctl reload apache2

# Monitor via curl and log (quote the URL so the shell does not interpret "?")
curl -s 'http://localhost/server-status?auto' | logger -t apache-status
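
The `?auto` output is a series of `Key: value` lines, which makes it easy to turn into structured fields. The sketch below parses that text into a dictionary, converting numeric values where possible. The sample values are made up, and the exact set of keys depends on your Apache version and configuration.

```python
# Illustrative excerpt of `server-status?auto` output; real output has more keys.
SAMPLE_STATUS = """Total Accesses: 1520
Total kBytes: 2048
Uptime: 3600
ReqPerSec: .422222
BusyWorkers: 3
IdleWorkers: 47
Scoreboard: __W_K...
"""


def _convert(value):
    """Return an int or float when the value is numeric, else the original string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    return value


def parse_server_status(text):
    """Parse `Key: value` lines from mod_status machine-readable output."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines without a "Key: value" shape
            status[key] = _convert(value.strip())
    return status


if __name__ == "__main__":
    print(parse_server_status(SAMPLE_STATUS))
```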

Migration from Other Tools

From AWStats

Replace AWStats analysis with LogFlux real-time processing:

[filestream.apache_awstats]
paths = ["/var/log/apache2/access.log"]
format = "apache_combined"
tags = ["apache", "analytics"]

# Add geographic processing
[filestream.apache_awstats.processors.geoip]
source_field = "remote_addr"
target_field = "geo"

From GoAccess

# Replace GoAccess real-time with LogFlux
# Old: goaccess /var/log/apache2/access.log --real-time-html
# New: LogFlux real-time dashboard with Apache integration

This Apache HTTP Server integration provides real-time log monitoring, security analysis, and performance tracking through LogFlux Agent’s File Stream plugin. Because it is driven entirely by configuration, it can be adapted to different Apache setups, log formats, and virtual host layouts.

Disclaimer

Apache HTTP Server and the Apache HTTP Server logo are trademarks of The Apache Software Foundation. LogFlux is not affiliated with, endorsed by, or sponsored by The Apache Software Foundation or the Apache HTTP Server project. The Apache logo is used solely for identification purposes to indicate compatibility with Apache HTTP Server logs.