Amazon CloudWatch Integration
Retrieve and stream logs from AWS CloudWatch Logs with LogFlux Agent
The LogFlux CloudWatch integration retrieves and streams logs from Amazon CloudWatch Logs so that logs from your AWS infrastructure can be analyzed centrally. The plugin connects directly to the CloudWatch Logs service and supports multiple authentication methods as well as filtering by log group, log stream, and CloudWatch filter pattern.
Overview
The CloudWatch plugin provides:
- CloudWatch Logs Integration: Direct connection to AWS CloudWatch Logs service
- Multiple Authentication Methods: IAM roles, profiles, access keys, and credential chains
- Log Group and Stream Filtering: Target specific log groups and streams
- Pattern Filtering: Apply CloudWatch filter patterns to reduce noise
- Follow Mode: Continuously poll for new log entries in near real time (latency is bounded by the poll interval)
- Batch Processing: Efficient batching for high-volume log retrieval
- Flexible Time Ranges: Query historical logs or stream real-time data
- Auto-discovery: Discover available log groups automatically
- Cross-Region Support: Connect to CloudWatch in any AWS region
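To make the Follow Mode item above more concrete, here is a minimal Python sketch of a polling loop that emits each event once. It is not the plugin's implementation: `fetch` is a hypothetical callable (for example, a thin wrapper around the FilterLogEvents API) that returns events with a timestamp at or after the value passed in, and `follow` is an illustrative name.

```python
from typing import Callable, Dict, Iterable, Iterator, Set

Event = Dict[str, object]


def follow(fetch: Callable[[int], Iterable[Event]], start_ms: int, polls: int) -> Iterator[Event]:
    """Yield new events from repeated polls, skipping events already emitted.

    `fetch(since_ms)` is a hypothetical callable returning events whose
    timestamp is >= since_ms; it is an assumption of this sketch.
    """
    since = start_ms
    seen_at_since: Set[object] = set()  # event IDs already emitted at timestamp == since
    for _ in range(polls):
        for event in sorted(fetch(since), key=lambda e: e["timestamp"]):
            ts, event_id = event["timestamp"], event["eventId"]
            if ts < since or (ts == since and event_id in seen_at_since):
                continue  # already emitted during an earlier poll
            if ts > since:
                since, seen_at_since = ts, set()
            seen_at_since.add(event_id)
            yield event
        # A real loop would sleep for poll_interval here before polling again.
```

Tracking the event IDs seen at the newest timestamp avoids both duplicates and gaps when several events share the same millisecond.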
Installation
The CloudWatch plugin is included with the LogFlux Agent but disabled by default.
Prerequisites
- LogFlux Agent installed (see Installation Guide)
- AWS credentials configured (IAM role, AWS CLI profile, or access keys)
- Appropriate IAM permissions for CloudWatch Logs access
- Network connectivity to AWS CloudWatch endpoints
Required IAM Permissions
|
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams",
        "logs:GetLogEvents",
        "logs:FilterLogEvents"
      ],
      "Resource": "*"
    }
  ]
}
|
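As a quick sanity check before deploying, a policy document can be compared against the four actions listed above. The following Python sketch is an illustration only: `missing_actions` is a hypothetical helper, it understands `*` wildcards in `Allow` statements, and it deliberately ignores `Deny` statements, conditions, and resource scoping, so it is not a substitute for the IAM policy simulator.

```python
import json
from fnmatch import fnmatchcase
from typing import Iterable, List

# The actions listed in the policy above.
REQUIRED_ACTIONS = (
    "logs:DescribeLogGroups",
    "logs:DescribeLogStreams",
    "logs:GetLogEvents",
    "logs:FilterLogEvents",
)


def missing_actions(policy_json: str, required: Iterable[str] = REQUIRED_ACTIONS) -> List[str]:
    """Return the required actions that no Allow statement in the policy grants."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    granted: List[str] = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        granted.extend([actions] if isinstance(actions, str) else actions)
    # IAM action names are case-insensitive and may contain '*' wildcards.
    return sorted(
        action for action in required
        if not any(fnmatchcase(action.lower(), pattern.lower()) for pattern in granted)
    )
```

An empty list means every required action is covered by at least one `Allow` statement.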
Enable the Plugin
|
# Enable and start the CloudWatch plugin
sudo systemctl enable --now logflux-cloudwatch
# Check status
sudo systemctl status logflux-cloudwatch
|
Configuration
Basic Configuration
Create or edit the CloudWatch plugin configuration:
|
sudo nano /etc/logflux-agent/plugins/cloudwatch.yaml
|
Basic configuration:
|
# CloudWatch Plugin Configuration
name: cloudwatch
version: 1.0.0
source: cloudwatch-plugin
# Agent connection
agent:
  socket_path: /tmp/logflux-agent.sock
# AWS Configuration
aws:
  region: us-east-1
  profile: "" # AWS profile name (optional)
# Log retrieval settings
cloudwatch:
  # Log groups to monitor
  log_groups:
    - "/aws/lambda/my-function"
    - "/aws/apigateway/my-api"
  # Specific log streams (optional)
  log_streams: []
  # Follow mode for real-time streaming
  follow: true
  poll_interval: 30s
  # Maximum events per request
  max_events: 10000
  # Filter pattern (CloudWatch syntax)
  filter_pattern: ""
# Logging metadata
logging:
  verbose: false
  labels:
    plugin: cloudwatch
    source: aws
# Batching for efficiency
batch:
  enabled: true
  size: 100
  flush_interval: 5s
|
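The `batch.size` and `batch.flush_interval` settings describe a common pattern: flush when the batch is full, or when a partial batch has waited long enough. The Python sketch below illustrates that pattern under stated assumptions. `Batcher`, `sink`, and `tick` are hypothetical names, and the plugin's actual batching logic is not documented here.

```python
import time
from typing import Callable, List


class Batcher:
    """Collect items and flush on size or on elapsed time, whichever comes first."""

    def __init__(self, size: int, flush_interval: float,
                 sink: Callable[[List[str]], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.size = size                      # corresponds to batch.size
        self.flush_interval = flush_interval  # corresponds to batch.flush_interval, in seconds
        self.sink = sink                      # receives each completed batch
        self.clock = clock                    # injectable for testing
        self.items: List[str] = []
        self.last_flush = clock()

    def add(self, item: str) -> None:
        self.items.append(item)
        if len(self.items) >= self.size:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes a partial batch once the interval has elapsed."""
        if self.items and self.clock() - self.last_flush >= self.flush_interval:
            self.flush()

    def flush(self) -> None:
        if self.items:
            self.sink(self.items)
            self.items = []
        self.last_flush = self.clock()
```

With `size=100` and `flush_interval=5.0`, a burst of 100 events is forwarded immediately, while a trickle of events is held for at most about five seconds.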
Advanced Configuration
|
# Advanced CloudWatch Configuration
name: cloudwatch
version: 1.0.0
source: cloudwatch-plugin
# Enhanced agent settings
agent:
  socket_path: /tmp/logflux-agent.sock
  connect_timeout: 30s
  max_retries: 5
  retry_delay: 10s
# AWS Configuration
aws:
  region: us-west-2
  # Authentication options
  profile: "production" # Named AWS profile
  # Or explicit credentials (not recommended for production)
  # access_key: "AKIA..."
  # secret_key: "..."
  # session_token: "..." # For temporary credentials
# Advanced CloudWatch settings
cloudwatch:
  # Multiple log groups with patterns
  log_groups:
    - "/aws/lambda/*"
    - "/aws/apigateway/*"
    - "/aws/ecs/cluster/*"
    - "/aws/rds/instance/*/error"
    - "/aws/elasticloadbalancing/*"
  # Specific log streams
  log_streams:
    - "2024/01/20/[$LATEST]"
    - "application-logs"
  # Time range for historical data
  start_time: "-1h" # 1 hour ago
  end_time: "" # Now (empty = current time)
  # Real-time following
  follow: true
  poll_interval: 15s
  # Request limits
  max_events: 50000
  # Advanced filtering
  filter_pattern: '[timestamp, request_id, level="ERROR", ...]'
  # Auto-discovery settings
  auto_discover: true
  discovery_pattern: "/aws/lambda/*"
# Enhanced metadata
logging:
  verbose: true
  labels:
    plugin: cloudwatch
    source: aws
    environment: production
    region: us-west-2
  # Custom field mapping
  field_mapping:
    log_group: "aws_log_group"
    log_stream: "aws_log_stream"
    event_id: "aws_event_id"
    ingestion_time: "aws_ingestion_time"
# Advanced batching
batch:
  enabled: true
  size: 500
  buffer_size: 10000
  flush_interval: 10s
  # Memory management
  max_memory: 100MB
# Monitoring and health
health:
  check_interval: 60s
  max_api_errors: 10
  alert_on_rate_limit: true
|
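The configuration above lists log groups with `*` wildcards. The CloudWatch Logs API itself works with concrete log group names, so patterns like these have to be expanded against the groups that actually exist. The Python sketch below shows one way such an expansion could work, using shell-style matching from the standard library. It is an assumption about the behaviour, not the plugin's documented matching rules, and `expand_log_groups` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def expand_log_groups(patterns: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the available log group names matching any pattern, without duplicates.

    Note: with fnmatch-style matching, '*' also matches '/' characters.
    """
    names = list(available)
    matched: List[str] = []
    for pattern in patterns:
        for name in names:
            if fnmatchcase(name, pattern) and name not in matched:
                matched.append(name)
    return matched
```

For example, the pattern `/aws/rds/instance/*/error` would select `/aws/rds/instance/prod-db/error` but not `/aws/rds/instance/prod-db/slowquery`.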
Usage Examples
Lambda Function Logs
|
# Monitor specific Lambda function
sudo logflux-cloudwatch \
-region us-east-1 \
-log-groups "/aws/lambda/my-function" \
-follow
# Monitor multiple Lambda functions
sudo logflux-cloudwatch \
-region us-east-1 \
-log-groups "/aws/lambda/function1,/aws/lambda/function2" \
-follow
# Filter for errors only
sudo logflux-cloudwatch \
-region us-east-1 \
-log-groups "/aws/lambda/my-function" \
-filter-pattern "[timestamp, request_id, level=ERROR, ...]" \
-follow
|
API Gateway Logs
|
# API Gateway monitoring
cloudwatch:
  log_groups:
    - "/aws/apigateway/my-api"
  filter_pattern: '[timestamp, request_id, ip, user, request_time, method, resource, protocol, status, error, ...]'
  follow: true
  poll_interval: 30s
logging:
  labels:
    service: api_gateway
    log_type: access
|
ECS Container Logs
|
# ECS cluster monitoring
cloudwatch:
  log_groups:
    - "/aws/ecs/containerinsights/my-cluster/application"
    - "/aws/ecs/containerinsights/my-cluster/performance"
  follow: true
  poll_interval: 20s
logging:
  labels:
    service: ecs
    cluster: my-cluster
|
RDS Database Logs
|
# RDS error log monitoring
cloudwatch:
  log_groups:
    - "/aws/rds/instance/prod-db/error"
    - "/aws/rds/instance/prod-db/slowquery"
  filter_pattern: "ERROR"
  follow: true
logging:
  labels:
    service: rds
    database: prod-db
    log_type: database
|
Command Line Usage
Basic Commands
|
# Monitor specific log group
logflux-cloudwatch -region us-east-1 -log-groups "/aws/lambda/my-function"
# Follow mode for real-time logs
logflux-cloudwatch -region us-east-1 -log-groups "/aws/lambda/my-function" -follow
# Historical logs from last hour
logflux-cloudwatch -region us-east-1 -log-groups "/aws/lambda/my-function" -start-time "-1h"
# Multiple log groups
logflux-cloudwatch -region us-east-1 -log-groups "/aws/lambda/func1,/aws/lambda/func2"
# Using AWS profile
logflux-cloudwatch -profile production -region us-west-2 -log-groups "/aws/lambda/my-function"
# Specific time range
logflux-cloudwatch -region us-east-1 \
-log-groups "/aws/lambda/my-function" \
-start-time "2024-01-20T10:00:00Z" \
-end-time "2024-01-20T11:00:00Z"
|
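The commands above accept both a relative offset (`-1h`) and an absolute RFC 3339 timestamp for the time range, while the CloudWatch Logs API expects epoch milliseconds. The Python sketch below shows how such values could be normalised. It is an illustration only: `resolve_time` is a hypothetical helper, and the set of units it accepts (`s`, `m`, `h`, `d`) is an assumption, since only `-1h` appears in this document.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; only "h" is shown in the documentation.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def resolve_time(value: str, now: datetime) -> int:
    """Return epoch milliseconds for '', a '-1h'-style offset, or an RFC 3339 timestamp."""
    if value == "":
        moment = now  # empty means "current time"
    else:
        relative = re.fullmatch(r"-(\d+)([smhd])", value)
        if relative:
            amount, unit = int(relative.group(1)), relative.group(2)
            moment = now - timedelta(**{_UNITS[unit]: amount})
        else:
            # Older Python versions do not accept a trailing 'Z' in fromisoformat.
            moment = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return int(moment.timestamp() * 1000)


if __name__ == "__main__":
    print(resolve_time("-1h", datetime.now(timezone.utc)))
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to test.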
Advanced Options
|
# Custom filter pattern
logflux-cloudwatch -region us-east-1 \
-log-groups "/aws/lambda/my-function" \
-filter-pattern "[timestamp, request_id, level=ERROR, ...]"
# Specific log streams
logflux-cloudwatch -region us-east-1 \
-log-groups "/aws/lambda/my-function" \
-log-streams '2024/01/20/[$LATEST]a1b2c3d4' # single quotes stop the shell from expanding $LATEST
# Custom batch settings
logflux-cloudwatch -region us-east-1 \
-log-groups "/aws/lambda/my-function" \
-batch-size 200 \
-flush-interval 10s
# Explicit AWS credentials (not recommended)
logflux-cloudwatch -region us-east-1 \
-access-key "AKIA..." \
-secret-key "..." \
-log-groups "/aws/lambda/my-function"
# Verbose output
logflux-cloudwatch -region us-east-1 \
-log-groups "/aws/lambda/my-function" \
-verbose
# Configuration file
logflux-cloudwatch -config /etc/logflux-agent/plugins/cloudwatch.yaml
|
Authentication Methods
IAM Role (Recommended)
|
# EC2 instance with IAM role
# No additional configuration needed - uses instance profile
# ECS task with task role
# Configure task definition with appropriate IAM role
|
AWS Profile
|
# Configure AWS CLI profile
aws configure --profile production
AWS Access Key ID [None]: AKIA...
AWS Secret Access Key [None]: ...
Default region name [None]: us-east-1
Default output format [None]: json
# Use in configuration
aws:
  profile: "production"
  region: us-east-1
|
Environment Variables
|
# Set AWS credentials via environment
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"
# Optional session token for temporary credentials
export AWS_SESSION_TOKEN="..."
|
Explicit Credentials
|
# Not recommended for production
aws:
  access_key: "AKIA..."
  secret_key: "..."
  session_token: "" # Optional
  region: us-east-1
|
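The four methods above can coexist on one host, so it helps to know which one is in effect. The Python sketch below is a hypothetical illustration of a resolution order: explicit keys in the plugin configuration, then a named profile, then environment variables, then an IAM role. This order is an assumption made for the example; neither the plugin's actual precedence nor `credential_source` is defined in this document.

```python
from typing import Mapping


def credential_source(aws_config: Mapping[str, str], env: Mapping[str, str]) -> str:
    """Name the credential source that would be used, under an ASSUMED precedence."""
    if aws_config.get("access_key") and aws_config.get("secret_key"):
        return "explicit"      # keys written in the plugin configuration
    if aws_config.get("profile"):
        return "profile"       # named AWS CLI profile
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"   # exported environment variables
    return "iam_role"          # instance profile or task role
```

Calling it with `os.environ` as the second argument reports what a process started from the current shell would pick up under these assumptions.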
CloudWatch Filter Patterns
Common Filter Patterns
|
# Error logs only
-filter-pattern "ERROR"
# Specific log level
-filter-pattern '[timestamp, request_id, level="ERROR", ...]'
# Multiple conditions
-filter-pattern '[timestamp, request_id, level="ERROR" || level="WARN", ...]'
# Field extraction
-filter-pattern '[timestamp, request_id="*-*-*", level, message]'
# Numeric filtering
-filter-pattern '[timestamp, request_id, level, duration > 1000]'
# Exclude patterns
-filter-pattern '[timestamp, request_id, level != "DEBUG", ...]'
# JSON log filtering
-filter-pattern '{ $.level = "ERROR" }'
# Complex JSON filtering
-filter-pattern '{ ($.level = "ERROR") && ($.service = "api") }'
|
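Space-delimited patterns like the ones above are easy to mistype by hand. The Python sketch below assembles one from a list of field names and per-field conditions. `space_delimited_pattern` is a hypothetical helper and only builds the string; it does not validate the result against CloudWatch. Rejecting duplicate field names is a readability choice of this sketch.

```python
from typing import Dict, List


def space_delimited_pattern(fields: List[str], conditions: Dict[str, str],
                            open_ended: bool = True) -> str:
    """Build a pattern such as '[timestamp, request_id, level="ERROR", ...]'.

    `conditions` maps a field name to the text appended after it,
    for example {'level': '="ERROR"'} or {'duration': ' > 1000'}.
    """
    if len(set(fields)) != len(fields):
        raise ValueError("field names must be unique")
    unknown = set(conditions) - set(fields)
    if unknown:
        raise ValueError(f"conditions reference unknown fields: {sorted(unknown)}")
    parts = [f"{name}{conditions.get(name, '')}" for name in fields]
    if open_ended:
        parts.append("...")  # ignore any remaining fields in the log line
    return "[" + ", ".join(parts) + "]"


if __name__ == "__main__":
    print(space_delimited_pattern(["timestamp", "request_id", "level"], {"level": '="ERROR"'}))
```

The resulting string can be passed to `-filter-pattern` inside single quotes so that the shell leaves it untouched.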
Pattern Examples by Service
Lambda Functions:
|
# Lambda errors
-filter-pattern "[timestamp, request_id, level=ERROR, ...]"
# Lambda cold starts
-filter-pattern "INIT_START"
# Lambda timeouts
-filter-pattern "Task timed out"
|
API Gateway:
|
# 4xx/5xx responses
-filter-pattern "[timestamp, request_id, ip, user, request_time, method, resource, protocol, status>=400, ...]"
# Specific endpoint errors
-filter-pattern '[timestamp, request_id, ip, user, request_time, method, resource="/api/users", protocol, status>=400, ...]'
|
ECS/Container:
|
# Container crashes
-filter-pattern "OOMKilled"
# Health check failures
-filter-pattern "Health check failed"
|
The plugin adds CloudWatch-specific metadata:
Field | Description | Example
source_type | Always “plugin” | plugin
source_name | Always “cloudwatch” | cloudwatch
aws_log_group | CloudWatch log group name | /aws/lambda/my-function
aws_log_stream | CloudWatch log stream name | 2024/01/20/[$LATEST]a1b2c3d4
aws_event_id | CloudWatch event ID | 12345678901234567890
aws_ingestion_time | CloudWatch ingestion timestamp | 1642679850000
aws_region | AWS region | us-east-1
Input CloudWatch Event:
|
{
  "eventId": "12345678901234567890",
  "ingestionTime": 1642679850000,
  "logGroupName": "/aws/lambda/my-function",
  "logStreamName": "2024/01/20/[$LATEST]a1b2c3d4",
  "message": "ERROR: Database connection failed",
  "timestamp": 1642679850000
}
|
Output LogFlux Log:
|
{
  "timestamp": "2022-01-20T11:57:30.000Z",
  "level": "info",
  "message": "ERROR: Database connection failed",
  "node": "aws",
  "metadata": {
    "source_type": "plugin",
    "source_name": "cloudwatch",
    "aws_log_group": "/aws/lambda/my-function",
    "aws_log_stream": "2024/01/20/[$LATEST]a1b2c3d4",
    "aws_event_id": "12345678901234567890",
    "aws_ingestion_time": 1642679850000,
    "aws_region": "us-east-1",
    "plugin": "cloudwatch",
    "environment": "production"
  }
}
|
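The transformation from a CloudWatch event to a LogFlux log entry can be sketched in a few lines of Python. This is an illustration of the mapping shown above, not the plugin's source: `to_logflux` is a hypothetical function, and the fixed `level` and `node` values simply mirror the sample output. CloudWatch timestamps are epoch milliseconds, so they are divided by 1000 before conversion to an ISO 8601 UTC string.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def to_logflux(event: Dict[str, Any], region: str, labels: Dict[str, str]) -> Dict[str, Any]:
    """Map a CloudWatch Logs event to the LogFlux entry shape shown in this section."""
    # CloudWatch reports epoch milliseconds; datetime expects seconds.
    moment = datetime.fromtimestamp(event["timestamp"] / 1000, tz=timezone.utc)
    metadata: Dict[str, Any] = {
        "source_type": "plugin",
        "source_name": "cloudwatch",
        "aws_log_group": event["logGroupName"],
        "aws_log_stream": event["logStreamName"],
        "aws_event_id": event["eventId"],
        "aws_ingestion_time": event["ingestionTime"],
        "aws_region": region,
    }
    metadata.update(labels)  # configured labels are merged into the metadata
    return {
        "timestamp": moment.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "level": "info",   # mirrors the sample output; level detection is not sketched here
        "message": event["message"],
        "node": "aws",     # mirrors the sample output
        "metadata": metadata,
    }
```

The region is passed in separately because the CloudWatch event itself does not carry it.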
High-Volume Configuration
|
# High-throughput settings
cloudwatch:
  max_events: 100000
  poll_interval: 10s
  # Use specific log groups to reduce API calls
  log_groups:
    - "/aws/lambda/high-volume-function"
batch:
  size: 1000
  buffer_size: 50000
  flush_interval: 30s
  max_memory: 500MB
|
Cost Optimization
|
# Reduce CloudWatch API calls
cloudwatch:
  poll_interval: 60s # Less frequent polling
  max_events: 1000 # Smaller batch sizes
  # Use filter patterns to reduce data transfer
  filter_pattern: "ERROR"
  # Target specific log streams
  log_streams:
    - "recent-stream-name"
|
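The effect of `poll_interval` on API volume is simple arithmetic, sketched below in Python. The estimate assumes one request per log group per poll, which is a simplification: paginated responses add further requests. `polls_per_day` is a hypothetical helper.

```python
SECONDS_PER_DAY = 86_400


def polls_per_day(poll_interval_s: int, log_groups: int = 1) -> int:
    """Lower-bound estimate of daily API requests: one request per log group per poll."""
    return (SECONDS_PER_DAY // poll_interval_s) * log_groups


if __name__ == "__main__":
    for interval in (10, 60, 300):
        print(f"poll_interval={interval}s -> {polls_per_day(interval)} requests/day per log group")
```

Moving from a 10 second to a 60 second interval cuts the baseline from 8,640 to 1,440 requests per day for each log group.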
Regional Optimization
|
# Run plugin in same region as resources
aws:
  region: us-east-1 # Same region as log groups
# Use VPC endpoints to avoid data transfer costs
# Configure VPC endpoint for logs.<region>.amazonaws.com
|
Monitoring and Alerting
Plugin Health Monitoring
|
#!/bin/bash
# check-cloudwatch-plugin.sh
if ! systemctl is-active --quiet logflux-cloudwatch; then
  echo "CRITICAL: LogFlux CloudWatch plugin is not running"
  exit 2
fi
# Check AWS connectivity
if ! aws logs describe-log-groups --region us-east-1 --max-items 1 &>/dev/null; then
  echo "CRITICAL: Cannot connect to CloudWatch Logs API"
  exit 2
fi
# Check recent log processing
if ! journalctl -u logflux-cloudwatch --since="10 minutes ago" | grep -q "events processed"; then
  echo "WARNING: No events processed in last 10 minutes"
  exit 1
fi
echo "OK: LogFlux CloudWatch plugin is healthy"
exit 0
|
CloudWatch Metrics Monitoring
|
# Monitor log ingestion volume for a log group (date -d is GNU date syntax)
aws cloudwatch get-metric-statistics \
  --namespace AWS/Logs \
  --metric-name IncomingLogEvents \
  --dimensions Name=LogGroupName,Value=/aws/lambda/my-function \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
  --period 300 \
  --statistics Sum
# List the metric filters defined on the log group
# (this call does not itself report API throttling)
aws logs describe-metric-filters \
  --log-group-name /aws/lambda/my-function
|
Common Use Cases
AWS Lambda Monitoring
|
# Lambda function monitoring
cloudwatch:
  log_groups:
    - "/aws/lambda/api-handler"
    - "/aws/lambda/data-processor"
    - "/aws/lambda/auth-service"
  filter_pattern: '[timestamp, request_id, level="ERROR", ...]'
  follow: true
  poll_interval: 30s
logging:
  labels:
    service: lambda
    environment: production
|
Microservices on ECS
|
# ECS service monitoring
cloudwatch:
  log_groups:
    - "/ecs/user-service"
    - "/ecs/order-service"
    - "/ecs/payment-service"
  follow: true
  poll_interval: 20s
logging:
  labels:
    architecture: microservices
    platform: ecs
|
Database Monitoring
|
# RDS and Aurora monitoring
cloudwatch:
  log_groups:
    - "/aws/rds/instance/prod-db/error"
    - "/aws/rds/cluster/aurora-prod/audit"
    - "/aws/rds/instance/prod-db/slowquery"
  filter_pattern: "ERROR"
  follow: true
logging:
  labels:
    service: database
    tier: data
|
API Gateway Monitoring
|
# API Gateway access logs
cloudwatch:
  log_groups:
    - "/aws/apigateway/prod-api"
    - "/aws/apigateway/stage-api"
  # Monitor 4xx and 5xx errors
  filter_pattern: '[timestamp, request_id, ip, user, request_time, method, resource, protocol, status>=400, ...]'
  follow: true
  poll_interval: 30s
logging:
  labels:
    service: api_gateway
    log_type: access
|
Security Considerations
IAM Best Practices
|
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams",
        "logs:GetLogEvents",
        "logs:FilterLogEvents"
      ],
      "Resource": [
        "arn:aws:logs:us-east-1:123456789012:log-group:/aws/lambda/*",
        "arn:aws:logs:us-east-1:123456789012:log-group:/aws/apigateway/*"
      ]
    }
  ]
}
|
Network Security
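|
# Security group for the VPC endpoint (allow inbound HTTPS on port 443 from the agent's subnets)
aws ec2 create-security-group \
  --group-name cloudwatch-logs-endpoint \
  --description "Security group for CloudWatch Logs VPC endpoint" \
  --vpc-id vpc-12345678
# Interface VPC endpoint for CloudWatch Logs
# (subnet-12345678 and sg-12345678 are placeholders for your subnet and the group created above)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-12345678 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.logs \
  --subnet-ids subnet-12345678 \
  --security-group-ids sg-12345678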
|
Credential Management
|
# Use IAM roles instead of access keys
aws:
  region: us-east-1
  # No credentials - use IAM role
# Rotate credentials regularly if using access keys
# Store credentials in AWS Secrets Manager or Parameter Store
|
Troubleshooting
Common Issues
Authentication Failures:
|
# Check AWS credentials
aws sts get-caller-identity
# Test CloudWatch access
aws logs describe-log-groups --region us-east-1 --max-items 1
# Check IAM permissions
aws iam simulate-principal-policy \
--policy-source-arn arn:aws:iam::123456789012:role/LogFluxRole \
--action-names logs:DescribeLogGroups \
--resource-arns "arn:aws:logs:us-east-1:123456789012:*"
|
No Logs Retrieved:
|
# Verify log group exists
aws logs describe-log-groups \
--log-group-name-prefix "/aws/lambda/my-function" \
--region us-east-1
# Check log group has recent data
aws logs describe-log-streams \
--log-group-name "/aws/lambda/my-function" \
--order-by LastEventTime \
--descending \
--max-items 5
# Test filter pattern
aws logs filter-log-events \
--log-group-name "/aws/lambda/my-function" \
--start-time 1642679850000 \
--filter-pattern "ERROR"
|
Rate Limiting:
|
# Check CloudWatch Logs quotas
aws service-quotas get-service-quota \
  --service-code logs \
  --quota-code L-F50550BC # GetLogEvents rate
# Increase poll interval
cloudwatch:
  poll_interval: 60s # Reduce API frequency
|
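Besides polling less often, a client that is being throttled usually spaces out its retries. The Python sketch below shows exponential backoff with full jitter, a general technique and not something this document confirms the plugin implements. `backoff_delays` and its `base` and `cap` parameters are hypothetical names chosen for the example.

```python
import random
from typing import List, Optional


def backoff_delays(base: float, cap: float, attempts: int,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Full-jitter exponential backoff.

    Attempt n (starting at 0) waits a random time between 0 and
    min(cap, base * 2**n) seconds.
    """
    rng = rng or random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays(base=1.0, cap=30.0, attempts=5)):
        print(f"retry {attempt + 1}: wait {delay:.2f}s")
```

The random component keeps several agents that were throttled at the same moment from retrying in lockstep.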
High Costs:
|
# Optimize for cost
cloudwatch:
  # Use specific log groups
  log_groups:
    - "/aws/lambda/critical-function"
  # Apply filters to reduce data transfer
  filter_pattern: "ERROR"
  # Increase poll interval
  poll_interval: 300s # 5 minutes
|
Debugging
|
# Enable verbose logging
sudo systemctl edit logflux-cloudwatch
# Add:
[Service]
Environment="LOGFLUX_LOG_LEVEL=debug"
# Monitor API calls
aws logs describe-log-groups --debug
# Check plugin logs
sudo journalctl -u logflux-cloudwatch -f
# Test connectivity
telnet logs.us-east-1.amazonaws.com 443
|
Best Practices
Configuration Management
- Use IAM roles instead of access keys when possible
- Apply filter patterns to reduce costs and noise
- Monitor specific log groups rather than broad patterns
- Set appropriate poll intervals based on log volume
- Optimize batch sizes for your log volume
- Use regional optimization - run in same region as log groups
- Implement VPC endpoints to reduce data transfer costs
- Monitor CloudWatch API quotas and adjust accordingly
Security
- Follow least privilege principle for IAM permissions
- Use VPC endpoints for private connectivity
- Rotate credentials regularly if using access keys
- Monitor API access through CloudTrail
Cost Management
- Use filter patterns to reduce data retrieval
- Target specific log streams when possible
- Adjust poll intervals based on requirements
- Monitor CloudWatch costs in AWS Billing
Disclaimer
Amazon Web Services, AWS, CloudWatch, and the AWS logo are trademarks of Amazon.com, Inc. or its affiliates. LogFlux is not affiliated with, endorsed by, or sponsored by Amazon Web Services, Inc. The AWS services and logos are referenced solely for identification purposes to indicate compatibility with AWS CloudWatch Logs.
Next Steps