AWS CloudFront Logs Integration
Analyze CloudFront access logs in Qorrelate
Overview
Amazon CloudFront is a content delivery network (CDN) service. This integration allows you to collect CloudFront access logs from S3 and forward them to Qorrelate for analysis, helping you understand traffic patterns, identify errors, and monitor CDN performance.
Prerequisites
- CloudFront distribution with logging enabled
- S3 bucket for CloudFront logs
- AWS Lambda execution role with S3 read permissions
- Your Qorrelate API endpoint, API key, and organization ID
1. Enable CloudFront Logging
Enable standard logging on your CloudFront distribution:
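# Using AWS CLI
# update-distribution replaces the whole configuration, so fetch the current
# config (and its ETag) first, then send the full config back with logging enabled.
aws cloudfront get-distribution-config --id YOUR_DISTRIBUTION_ID > current.json
# Copy the "DistributionConfig" object from current.json into dist-config.json
# and set its "Logging" section to:
#   "Logging": {
#     "Enabled": true,
#     "Bucket": "your-logs-bucket.s3.amazonaws.com",
#     "Prefix": "cloudfront-logs/",
#     "IncludeCookies": false
#   }
aws cloudfront update-distribution \
  --id YOUR_DISTRIBUTION_ID \
  --if-match ETAG_FROM_CURRENT_JSON \
  --distribution-config file://dist-config.json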
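Because `update-distribution` expects the complete distribution configuration together with the ETag of the version being replaced, enabling logging is a read-modify-write operation. The following is a minimal sketch of that pattern in Python with boto3. The helper names `with_standard_logging` and `enable_logging` are hypothetical, and the bucket and prefix values are the placeholders used in this guide.

```python
import copy


def with_standard_logging(distribution_config, bucket_domain, prefix):
    """Return a copy of a DistributionConfig dict with standard logging enabled."""
    updated = copy.deepcopy(distribution_config)
    updated["Logging"] = {
        "Enabled": True,
        "IncludeCookies": False,
        "Bucket": bucket_domain,  # e.g. "your-logs-bucket.s3.amazonaws.com"
        "Prefix": prefix,         # e.g. "cloudfront-logs/"
    }
    return updated


def enable_logging(distribution_id, bucket_domain, prefix):
    """Fetch the current config, merge the Logging block, and write it back."""
    import boto3  # imported lazily so the helper above works without AWS access

    cloudfront = boto3.client("cloudfront")
    current = cloudfront.get_distribution_config(Id=distribution_id)
    new_config = with_standard_logging(
        current["DistributionConfig"], bucket_domain, prefix
    )
    # IfMatch must carry the ETag returned by get_distribution_config
    return cloudfront.update_distribution(
        Id=distribution_id,
        IfMatch=current["ETag"],
        DistributionConfig=new_config,
    )
```

Calling `enable_logging("YOUR_DISTRIBUTION_ID", "your-logs-bucket.s3.amazonaws.com", "cloudfront-logs/")` would apply the change; the pure helper can be checked on its own before touching a live distribution.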
2. Create Lambda Function
Create a Lambda function that parses CloudFront logs and forwards them to Qorrelate:
import gzip
import json
import os
from urllib.parse import unquote_plus

import boto3
import requests  # not bundled with the Lambda runtime; ship it in the package or a layer

s3 = boto3.client('s3')

# CloudFront standard log field names, in file order
CF_FIELDS = [
    'date', 'time', 'x_edge_location', 'sc_bytes', 'c_ip',
    'cs_method', 'cs_host', 'cs_uri_stem', 'sc_status',
    'cs_referer', 'cs_user_agent', 'cs_uri_query', 'cs_cookie',
    'x_edge_result_type', 'x_edge_request_id', 'x_host_header',
    'cs_protocol', 'cs_bytes', 'time_taken', 'x_forwarded_for',
    'ssl_protocol', 'ssl_cipher', 'x_edge_response_result_type',
    'cs_protocol_version', 'fle_status', 'fle_encrypted_fields',
    'c_port', 'time_to_first_byte', 'x_edge_detailed_result_type',
    'sc_content_type', 'sc_content_len', 'sc_range_start', 'sc_range_end'
]


def lambda_handler(event, context):
    logs = []
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys are URL-encoded in S3 event notifications
        key = unquote_plus(record['s3']['object']['key'])

        # Get and decompress log file
        response = s3.get_object(Bucket=bucket, Key=key)
        content = gzip.decompress(response['Body'].read()).decode('utf-8')

        # Parse log lines (skip comments)
        for line in content.split('\n'):
            if line.startswith('#') or not line.strip():
                continue
            fields = line.split('\t')
            if len(fields) < 19:
                continue
            log_data = dict(zip(CF_FIELDS, fields))

            # Build an ISO 8601 timestamp (CloudFront logs are in UTC)
            timestamp = f"{log_data['date']}T{log_data['time']}Z"
            status = log_data['sc_status']
            # time_taken is reported in seconds; convert to milliseconds
            try:
                time_taken_ms = round(float(log_data['time_taken']) * 1000)
            except ValueError:
                time_taken_ms = None

            logs.append({
                "timestamp": timestamp,
                "body": json.dumps(log_data),
                "severity_text": "ERROR" if status.isdigit() and int(status) >= 400 else "INFO",
                "attributes": {
                    "source": "aws-cloudfront",
                    "edge_location": log_data.get('x_edge_location'),
                    "status_code": status,
                    "method": log_data.get('cs_method'),
                    "uri": log_data.get('cs_uri_stem'),
                    "client_ip": log_data.get('c_ip'),
                    "time_taken_ms": time_taken_ms
                }
            })

    if logs:
        # Forward to Qorrelate
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['QORRELATE_API_KEY']}",
            "X-Organization-Id": os.environ["QORRELATE_ORG_ID"]
        }
        # Send in batches of 100
        for i in range(0, len(logs), 100):
            batch = logs[i:i+100]
            resp = requests.post(
                f"{os.environ['QORRELATE_ENDPOINT']}/v1/logs",
                headers=headers,
                json={"logs": batch},
                timeout=10
            )
            # Fail the invocation on a non-2xx response so Lambda can retry
            resp.raise_for_status()

    return {'statusCode': 200, 'body': f'Processed {len(logs)} log entries'}
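The parsing step can be exercised locally, without S3 or Lambda, by feeding it a tab-separated line. The sketch below isolates that logic; `parse_cloudfront_line` and `severity_for` are hypothetical helper names, and the sample values are made up for illustration.

```python
# Field names for CloudFront standard logs, in file order
CF_FIELDS = [
    'date', 'time', 'x_edge_location', 'sc_bytes', 'c_ip',
    'cs_method', 'cs_host', 'cs_uri_stem', 'sc_status',
    'cs_referer', 'cs_user_agent', 'cs_uri_query', 'cs_cookie',
    'x_edge_result_type', 'x_edge_request_id', 'x_host_header',
    'cs_protocol', 'cs_bytes', 'time_taken', 'x_forwarded_for',
    'ssl_protocol', 'ssl_cipher', 'x_edge_response_result_type',
    'cs_protocol_version', 'fle_status', 'fle_encrypted_fields',
    'c_port', 'time_to_first_byte', 'x_edge_detailed_result_type',
    'sc_content_type', 'sc_content_len', 'sc_range_start', 'sc_range_end'
]


def parse_cloudfront_line(line):
    """Return a dict for one log line, or None for headers, blanks, and short lines."""
    if line.startswith('#') or not line.strip():
        return None
    fields = line.rstrip('\n').split('\t')
    if len(fields) < 19:
        return None
    # zip stops at the shorter sequence, so older, shorter log formats still parse
    return dict(zip(CF_FIELDS, fields))


def severity_for(status):
    """Map an HTTP status string to a severity, tolerating non-numeric values."""
    return "ERROR" if status.isdigit() and int(status) >= 400 else "INFO"


# Illustrative sample line with 19 fields (values are invented)
sample_line = '\t'.join([
    '2024-01-15', '10:30:00', 'IAD89-C1', '1045', '203.0.113.5',
    'GET', 'd111111abcdef8.cloudfront.net', '/index.html', '404',
    '-', 'Mozilla/5.0', '-', '-', 'Error', 'example-request-id',
    'example.com', 'https', '23', '0.002',
])

entry = parse_cloudfront_line(sample_line)
print(entry['sc_status'], severity_for(entry['sc_status']))
```

Keeping the parser free of AWS calls makes it straightforward to unit test against real log lines copied from the bucket.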
3. Configure S3 Event Trigger
Set up S3 to trigger the Lambda function when new logs arrive:
# Using AWS CLI
# S3 must already be allowed to invoke the function (aws lambda add-permission
# with --principal s3.amazonaws.com), otherwise this call is rejected.
aws s3api put-bucket-notification-configuration \
  --bucket your-logs-bucket \
  --notification-configuration '{
    "LambdaFunctionConfigurations": [
      {
        "LambdaFunctionArn": "arn:aws:lambda:region:account:function:cloudfront-to-qorrelate",
        "Events": ["s3:ObjectCreated:*"],
        "Filter": {
          "Key": {
            "FilterRules": [
              {"Name": "prefix", "Value": "cloudfront-logs/"},
              {"Name": "suffix", "Value": ".gz"}
            ]
          }
        }
      }
    ]
  }'
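On the Lambda side, each notification arrives as a list of records carrying the bucket name and a URL-encoded object key. The following sketch shows one way to pull out the objects that match the prefix and suffix filter; `log_objects` is a hypothetical helper and the sample event contains only the fields this guide relies on.

```python
from urllib.parse import unquote_plus


def log_objects(event, prefix="cloudfront-logs/", suffix=".gz"):
    """Yield (bucket, key) for each S3 record that looks like a CloudFront log file."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in notifications (spaces arrive as "+")
        key = unquote_plus(record["s3"]["object"]["key"])
        # Re-check the filter so a misconfigured trigger does not feed in other files
        if key.startswith(prefix) and key.endswith(suffix):
            yield bucket, key


# Trimmed-down sample event for local testing (values are illustrative)
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "your-logs-bucket"},
                "object": {"key": "cloudfront-logs/E2EXAMPLE.2024-01-15-10.abcd1234.gz"}}},
        {"s3": {"bucket": {"name": "your-logs-bucket"},
                "object": {"key": "other/readme.txt"}}},
    ]
}

for bucket, key in log_objects(sample_event):
    print(bucket, key)
```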
4. Set Environment Variables
# Lambda environment variables
QORRELATE_API_KEY=your_api_key
QORRELATE_ORG_ID=your_organization_id
QORRELATE_ENDPOINT=https://qorrelate.io
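A missing variable otherwise surfaces only as a `KeyError` deep inside the handler, so it can help to validate all three up front. This is a minimal sketch, assuming the variable names above; `load_config` and the keys of the returned dict are hypothetical.

```python
import os

REQUIRED_VARS = ("QORRELATE_API_KEY", "QORRELATE_ORG_ID", "QORRELATE_ENDPOINT")


def load_config(environ=None):
    """Read the Qorrelate settings, failing fast with a clear message if any are unset."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": environ["QORRELATE_API_KEY"],
        "org_id": environ["QORRELATE_ORG_ID"],
        # Strip a trailing slash so the joined URL never contains "//v1/logs"
        "logs_url": environ["QORRELATE_ENDPOINT"].rstrip("/") + "/v1/logs",
    }
```

Calling `load_config()` once at module import time means a misconfigured function fails on its first invocation rather than after downloading a log file.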
CloudFront Real-time Logs (Alternative)
For real-time analysis, use CloudFront real-time logs with Kinesis:
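# Create Kinesis stream
aws kinesis create-stream \
  --stream-name cloudfront-realtime-logs \
  --shard-count 1
# Create real-time log config
# The endpoint needs both the stream ARN and an IAM role that CloudFront can
# assume to write to the stream.
aws cloudfront create-realtime-log-config \
  --name qorrelate-realtime \
  --sampling-rate 100 \
  --fields "timestamp" "c-ip" "sc-status" "cs-uri-stem" \
  --end-points '[{"StreamType": "Kinesis", "KinesisStreamConfig": {"RoleARN": "arn:aws:iam::account:role/YOUR_CLOUDFRONT_REALTIME_ROLE", "StreamARN": "arn:aws:kinesis:region:account:stream/cloudfront-realtime-logs"}}]'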
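A consumer on the Kinesis side receives each record base64-encoded. The sketch below decodes a Lambda Kinesis event into dicts, assuming each record is a tab-delimited line whose values follow the four fields configured above; that layout and the helper name `decode_kinesis_records` are assumptions to verify against your own stream.

```python
import base64

# Assumed field order, matching the --fields list in the real-time log config
REALTIME_FIELDS = ["timestamp", "c_ip", "sc_status", "cs_uri_stem"]


def decode_kinesis_records(event):
    """Decode base64 Kinesis payloads into one dict per real-time log line."""
    entries = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        for line in payload.splitlines():
            values = line.split("\t")
            if len(values) != len(REALTIME_FIELDS):
                continue  # skip lines that do not match the expected layout
            entries.append(dict(zip(REALTIME_FIELDS, values)))
    return entries


# Illustrative event built locally (values are invented)
raw = "1705314600.123\t203.0.113.5\t200\t/index.html\n"
sample_event = {
    "Records": [{"kinesis": {"data": base64.b64encode(raw.encode("utf-8")).decode("ascii")}}]
}
print(decode_kinesis_records(sample_event))
```

The resulting dicts can be mapped to the same `timestamp`, `severity_text`, and `attributes` shape that the S3-based function sends to Qorrelate.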
Useful Queries in Qorrelate
Once logs are flowing, try these queries:
# Find all 4xx/5xx errors
source:aws-cloudfront AND status_code:[400 TO 599]
# Analyze slow requests (> 1 second)
source:aws-cloudfront AND time_taken_ms:>1000
# Find requests from specific edge location
source:aws-cloudfront AND edge_location:IAD*
# Track specific URI patterns
source:aws-cloudfront AND uri:/api/*
Verifying the Integration
- Deploy the Lambda function with proper IAM permissions
- Configure S3 event notification
- Make some requests through your CloudFront distribution
- Wait for CloudFront to write logs to S3 (delivery usually happens within an hour of the requests and can occasionally take longer)
- View logs in Qorrelate filtered by source:aws-cloudfront
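While waiting, the object names in the bucket show which hour of traffic has been delivered so far. The sketch below parses a standard log file name, assuming the documented `<prefix><distribution-ID>.YYYY-MM-DD-HH.<unique-ID>.gz` pattern; `parse_log_key` is a hypothetical helper and the sample key is illustrative.

```python
import re
from datetime import datetime, timezone

# <distribution-ID>.YYYY-MM-DD-HH.<unique-ID>.gz at the end of the object key
LOG_KEY_RE = re.compile(
    r"(?P<dist>[A-Z0-9]+)\.(?P<hour>\d{4}-\d{2}-\d{2}-\d{2})\.(?P<uid>[0-9A-Za-z]+)\.gz$"
)


def parse_log_key(key):
    """Return (distribution_id, hour_in_utc) for a log object key, or None."""
    match = LOG_KEY_RE.search(key)
    if match is None:
        return None
    hour = datetime.strptime(match["hour"], "%Y-%m-%d-%H").replace(tzinfo=timezone.utc)
    return match["dist"], hour


print(parse_log_key("cloudfront-logs/EMLARXS9EXAMPLE.2019-11-14-20.RT4KCN4SGK9.gz"))
```

If the newest key in the bucket is older than the test requests, the logs have not been delivered yet and nothing will appear in Qorrelate.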
💡 Pro Tip
CloudFront logs are delivered to S3 multiple times per hour. For more real-time visibility, consider using CloudFront real-time logs with Kinesis, which delivers logs within seconds.