High-Level Architecture (Direct Upload)
Avoid routing file uploads through your backend API servers. Doing so consumes server bandwidth and ties up request threads for the duration of each transfer. Have clients upload directly to Object Storage (S3) instead.
S3 Presigned URLs
A secure way to grant temporary upload access without exposing AWS credentials.
✅ Benefits
- Scalability: Your servers don't handle file data.
- Security: URL expires quickly (e.g., 5 mins).
- Access Control: You can pin the file type (MIME) in a presigned PUT URL. Enforcing a max size requires a presigned POST policy (`content-length-range` condition).
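The size restriction mentioned above is enforced by S3 through a presigned POST policy. A minimal sketch, assuming a boto3 S3 client is passed in; the function name `create_restricted_upload`, the 5 MB cap, and the default content type are illustrative assumptions, not part of the original text:

```python
def create_restricted_upload(s3_client, bucket, key,
                             content_type="image/jpeg",
                             max_bytes=5 * 1024 * 1024,
                             expires_in=300):
    """Return a presigned POST (url + form fields) that S3 only accepts
    when the upload matches the content type and size limit."""
    # Fields are sent by the client as form fields; conditions are what
    # S3 checks against the signed policy.
    fields = {"Content-Type": content_type}
    conditions = [
        {"Content-Type": content_type},
        ["content-length-range", 1, max_bytes],
    ]
    return s3_client.generate_presigned_post(
        Bucket=bucket,
        Key=key,
        Fields=fields,
        Conditions=conditions,
        ExpiresIn=expires_in,
    )
```

Called as `create_restricted_upload(boto3.client("s3"), "my-bucket", "uploads/user1/avatar.jpg")`, this returns a dict with `url` and `fields`. The client then submits a `multipart/form-data` POST containing those fields followed by the file.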
❌ Limitations
- Complexity: Client needs to make 2 requests (Get URL + Upload).
- CORS: Need to configure S3 CORS correctly.
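For browser uploads, the bucket needs a CORS rule that allows the upload methods from your site's origin. A minimal sketch, assuming a boto3 S3 client is passed in; the helper names and the example origin are hypothetical:

```python
def build_upload_cors_rules(allowed_origins, max_age_seconds=3000):
    """Build a CORS configuration that permits direct browser uploads."""
    return {
        "CORSRules": [
            {
                "AllowedOrigins": list(allowed_origins),
                "AllowedMethods": ["PUT", "POST"],
                "AllowedHeaders": ["*"],
                # Browsers need to read ETag to complete multipart uploads.
                "ExposeHeaders": ["ETag"],
                "MaxAgeSeconds": max_age_seconds,
            }
        ]
    }


def apply_upload_cors(s3_client, bucket, allowed_origins):
    """Apply the CORS configuration to the bucket."""
    s3_client.put_bucket_cors(
        Bucket=bucket,
        CORSConfiguration=build_upload_cors_rules(allowed_origins),
    )
```

For example, `apply_upload_cors(boto3.client("s3"), "my-bucket", ["https://app.example.com"])`. Listing explicit origins rather than `*` keeps other sites from driving uploads from a user's browser.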
Python (Boto3) Implementation
```python
import logging

import boto3
from botocore.exceptions import ClientError


def generate_presigned_url(bucket_name, object_name, expiration=300,
                           content_type='image/jpeg'):
    """Generate a presigned URL for uploading (PUT) an object to S3."""
    s3_client = boto3.client('s3')
    try:
        # ContentType is part of the signature, so the client must send
        # the same Content-Type header or S3 rejects the upload.
        response = s3_client.generate_presigned_url(
            'put_object',
            Params={'Bucket': bucket_name,
                    'Key': object_name,
                    'ContentType': content_type},
            ExpiresIn=expiration)
    except ClientError as e:
        logging.error(e)
        return None
    return response


# Usage
url = generate_presigned_url('my-bucket', 'uploads/user1/avatar.jpg')
print(f"Upload here: {url}")
```
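The second request in the flow is the upload itself. A minimal client-side sketch using only the standard library; the helper names are hypothetical, and it assumes the URL was signed for `image/jpeg` as in the example above:

```python
import urllib.request


def build_upload_request(presigned_url, data, content_type="image/jpeg"):
    """Build the PUT request a client sends to a presigned URL."""
    return urllib.request.Request(
        presigned_url,
        data=data,
        method="PUT",
        # Must match the ContentType that was signed into the URL.
        headers={"Content-Type": content_type},
    )


def upload_bytes(presigned_url, data, content_type="image/jpeg"):
    """Send the upload and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on a non-2xx response,
    for example when the URL has expired.
    """
    request = build_upload_request(presigned_url, data, content_type)
    with urllib.request.urlopen(request) as response:
        return response.status
```

No AWS credentials or SDK are needed on the client. The signature in the query string is the only authorization.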
Large Files: Multipart Upload
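For files > 100MB, a single-request upload has to restart from scratch if the connection drops. Use Multipart Upload instead: AWS recommends it once objects reach roughly 100 MB, and a single PUT is capped at 5 GB.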
| Feature | Standard Upload | Multipart Upload |
|---|---|---|
| Method | Single PUT request | Split into parts (at least 5 MiB each, except the last) |
| Failure Recovery | Restart from 0% | Retry only failed parts |
| Speed | One sequential stream | Parts upload in parallel (faster) |
| Use Case | Avatars, Documents (<100MB) | Videos, Large Archives (>100MB) |
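The multipart flow is: start the upload, send each part, then ask S3 to assemble them. A minimal sketch, assuming a boto3 S3 client is passed in and a non-empty file; the function name and the 8 MiB part size are illustrative, and parts are sent sequentially here for brevity:

```python
def upload_in_parts(s3_client, bucket, key, fileobj,
                    part_size=8 * 1024 * 1024):
    """Upload a file-like object to S3 in parts; abort on failure."""
    upload = s3_client.create_multipart_upload(Bucket=bucket, Key=key)
    upload_id = upload["UploadId"]
    parts = []
    try:
        part_number = 1
        while True:
            chunk = fileobj.read(part_size)
            if not chunk:
                break
            response = s3_client.upload_part(
                Bucket=bucket, Key=key, UploadId=upload_id,
                PartNumber=part_number, Body=chunk,
            )
            # S3 needs each part's ETag to assemble the final object.
            parts.append({"PartNumber": part_number,
                          "ETag": response["ETag"]})
            part_number += 1
        return s3_client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so incomplete parts do not linger in the bucket.
        s3_client.abort_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

Because each part is an independent request, a failed part can be retried on its own, and parts can be sent concurrently. For server-side code, boto3's higher-level `upload_file` already switches to multipart above a configurable threshold.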
Security Best Practices
Virus Scanning
Trigger a Lambda function on S3 upload to scan with ClamAV. Quarantine malicious files immediately.
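A rough sketch of such a scanner, assuming the `clamscan` binary and its virus definitions are packaged with the function (for example in a container image) and that a boto3 S3 client is passed in. The quarantine bucket name and function names are hypothetical:

```python
import subprocess
import urllib.parse

QUARANTINE_BUCKET = "my-quarantine-bucket"  # hypothetical bucket name


def scan_with_clamav(path):
    """Return True when clamscan reports an infection.

    clamscan exits with 0 (clean), 1 (virus found), or 2 (error).
    """
    result = subprocess.run(["clamscan", "--no-summary", path],
                            capture_output=True, text=True)
    if result.returncode not in (0, 1):
        raise RuntimeError(f"clamscan failed: {result.stderr}")
    return result.returncode == 1


def process_upload_event(event, s3_client, scan=scan_with_clamav):
    """Scan every object named in an S3 event; quarantine infected ones."""
    quarantined = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        local_path = "/tmp/" + key.replace("/", "_")
        s3_client.download_file(bucket, key, local_path)
        if scan(local_path):
            # Move the object out of the upload bucket.
            s3_client.copy_object(
                Bucket=QUARANTINE_BUCKET, Key=key,
                CopySource={"Bucket": bucket, "Key": key})
            s3_client.delete_object(Bucket=bucket, Key=key)
            quarantined.append(key)
    return quarantined
```

A Lambda entry point would call `process_upload_event(event, boto3.client("s3"))`. Passing the scanner in as a parameter keeps the quarantine logic testable without ClamAV installed.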
Validate MIME Types
Don't trust the file extension. Inspect the "Magic Bytes" (file signature) to ensure it's a valid image/pdf.
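A minimal sketch of a signature check using only the standard library. The signature table covers a few common formats and is not exhaustive, and the helper names are hypothetical:

```python
# Leading bytes ("magic bytes") for a few common formats.
SIGNATURES = {
    "image/jpeg": [b"\xff\xd8\xff"],
    "image/png": [b"\x89PNG\r\n\x1a\n"],
    "image/gif": [b"GIF87a", b"GIF89a"],
    "application/pdf": [b"%PDF-"],
}


def sniff_mime_type(header):
    """Return the MIME type matching the leading bytes, or None."""
    for mime_type, prefixes in SIGNATURES.items():
        if any(header.startswith(prefix) for prefix in prefixes):
            return mime_type
    # WebP is a RIFF container: "RIFF" + 4-byte size + "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "image/webp"
    return None


def is_allowed_upload(header,
                      allowed=("image/jpeg", "image/png", "application/pdf")):
    """Check the first bytes of a file against an allow-list of types."""
    return sniff_mime_type(header) in allowed
```

Only the first few bytes are needed, so the check can run on a ranged read of the object instead of a full download. A matching signature shows the file starts like that format, not that the rest of it is well-formed.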
Private Bucket
Keep the S3 bucket private. Serve public content through CloudFront (CDN), and serve private files through Signed URLs.
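A minimal sketch of both halves, assuming a boto3 S3 client is passed in: blocking public access on the bucket, then issuing a short-lived presigned GET URL for a private file. The function names and the 5-minute expiry are illustrative:

```python
def lock_down_bucket(s3_client, bucket):
    """Block every form of public access on the bucket."""
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )


def private_download_url(s3_client, bucket, key, expires_in=300):
    """Return a temporary download URL for a private object."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

The application would check that the requesting user may read the file before calling `private_download_url`, since anyone holding the URL can download the object until it expires.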
Media Processing Pipeline
Never serve raw user uploads. They are usually unoptimized (e.g., a 5MB 4K image for a 50px avatar).
- Trigger: S3 ObjectCreated event sends a message to SQS.
- Worker: Consumes SQS, downloads image.
- Process: Resizes (thumbnails), optimizes (WebP/AVIF), strips metadata (EXIF).
- Save: Uploads processed versions to a "Public/Processed" bucket.
- Serve: Application uses URL of processed version.
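The worker side of this pipeline can be sketched as follows. This assumes the SQS message body is the raw S3 event JSON and that a boto3 S3 client is passed in. `process_image` is a hypothetical callable (bytes in, dict of variant name to WebP bytes out) standing in for the resize, optimize, and EXIF-strip step. The SQS polling loop is omitted:

```python
import json
import posixpath
import urllib.parse


def parse_s3_records(message_body):
    """Extract (bucket, key) pairs from an SQS message carrying an S3 event."""
    event = json.loads(message_body)
    # S3 test events have no "Records" entry, so default to an empty list.
    return [
        (record["s3"]["bucket"]["name"],
         urllib.parse.unquote_plus(record["s3"]["object"]["key"]))
        for record in event.get("Records", [])
    ]


def processed_key(key, variant, extension="webp"):
    """Derive the key of a processed variant from the original key."""
    stem, _ = posixpath.splitext(key)
    return f"{stem}_{variant}.{extension}"


def handle_message(message_body, s3_client, process_image, output_bucket):
    """Download each uploaded image, process it, and store the variants."""
    written = []
    for bucket, key in parse_s3_records(message_body):
        original = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
        # process_image is hypothetical: bytes -> {variant_name: webp_bytes}
        for variant, data in process_image(original).items():
            out_key = processed_key(key, variant)
            # Write to a separate bucket; the original is left untouched.
            s3_client.put_object(Bucket=output_bucket, Key=out_key,
                                 Body=data, ContentType="image/webp")
            written.append(out_key)
    return written
```

With this naming, `uploads/user1/avatar.jpg` yields keys such as `uploads/user1/avatar_thumb.webp` in the processed bucket, which is the URL the application would serve.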
Summary
- Use Presigned URLs for direct client-to-S3 uploads (scalable).
- Implement Multipart Uploads for large files to support retries and parallel chunks.
- Scan every file for Viruses asynchronously.
- Serve content via CDN (CloudFront/Cloudflare) for speed and caching.
- Never overwrite the original file; store processed versions separately.