Message Brokers

Notification Queue Internals

Mastering message delivery: Fanout patterns, guaranteeing order with partitions, handling backpressure, and implementing Dead Letter Queues (DLQ).

Fanout Pattern (Pub/Sub)

A single event (e.g., "New Post") needs to trigger multiple downstream actions: send Push Notification, update Search Index, and Archive to Data Lake.

Publisher → Topic: events.post → Consumer A (Push)
                               → Consumer B (Search)
                               → Consumer C (Analytics)

Implementation: Use Consumer Groups in Kafka. Each service (Push, Search) is a separate Consumer Group. Kafka delivers every message to each group, but within a group only one instance processes it.
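The consumer-group behavior can be sketched with a minimal in-memory model (the `Topic` class and its method names here are illustrative, not a real Kafka client API): every group receives the message, but round-robin picks a single instance inside each group.

```python
# In-memory sketch of fanout with consumer groups (illustrative, not Kafka's API):
# the topic delivers each message to every group, but only one instance
# inside a group handles it.
from collections import defaultdict
from itertools import cycle

class Topic:
    def __init__(self):
        self.groups = defaultdict(list)   # group name -> consumer instances
        self._pickers = {}                # group name -> round-robin iterator

    def subscribe(self, group, instance):
        self.groups[group].append(instance)
        self._pickers[group] = cycle(self.groups[group])

    def publish(self, message):
        # Fanout: every group sees the message...
        for group in self.groups:
            # ...but only one instance per group processes it.
            next(self._pickers[group])(message)

topic = Topic()
seen = defaultdict(list)
topic.subscribe("push",   lambda m: seen["push"].append(m))
topic.subscribe("search", lambda m: seen["search"].append(m))
topic.publish("New Post")
# Each group received the event exactly once.
```

Adding a second instance to the "push" group would spread its messages across both instances without the "search" group noticing, which is exactly the scaling property consumer groups give you.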

Guaranteed Message Ordering

Strict global ordering limits scalability. We usually need ordering only per entity (e.g., all messages for User A must be processed in order).

Partition Key Strategy

When publishing to Kafka/Kinesis, use user_id as the Partition Key.

  • All events for user_123 go to Partition 1.
  • Consumer 1 reads Partition 1 strictly FIFO.
  • Result: User A's "Create" event always processes before "Delete".
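Producer-side partition selection can be sketched as follows; this is a simplified stand-in for what Kafka/Kinesis clients do internally when you supply a key (the `partition_for` helper is an assumption for illustration):

```python
# Sketch of key-based partition selection: a stable hash of user_id picks
# the partition, so all of one user's events land on the same partition
# and are read back in FIFO order. (Helper name is illustrative.)
import hashlib

NUM_PARTITIONS = 8

def partition_for(key: str) -> int:
    # A stable hash (md5) rather than Python's salted hash(), so the
    # mapping is identical across producer processes.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# Every event for user_123 maps to the same partition.
p = partition_for("user_123")
assert all(partition_for("user_123") == p for _ in range(100))
```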
⚠️ The Pitfall

If you use threads in your consumer (Parallel Processing), you lose ordering guarantees within that shard.

Fix: Use consistent hashing within the consumer or single-threaded processing per partition.
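The consistent-hashing fix can be sketched like this: keep multiple worker threads, but route each key to one fixed per-worker queue, so events for a given user stay FIFO even though the consumer as a whole is parallel (queue layout and names are assumptions for illustration):

```python
# Sketch: parallelism without losing per-key ordering. Each user_id hashes
# to one fixed worker queue, so that user's events are processed FIFO.
import queue
import threading

NUM_WORKERS = 4
queues = [queue.Queue() for _ in range(NUM_WORKERS)]
results = {i: [] for i in range(NUM_WORKERS)}

def worker(i):
    while True:
        msg = queues[i].get()
        if msg is None:           # sentinel: shut down
            return
        results[i].append(msg)    # stand-in for real processing

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
for t in threads:
    t.start()

def dispatch(user_id, event):
    # Same user always hashes to the same worker -> FIFO per user.
    queues[hash(user_id) % NUM_WORKERS].put((user_id, event))

dispatch("user_123", "create")
dispatch("user_123", "delete")
for q in queues:
    q.put(None)
for t in threads:
    t.join()
# user_123's events sit on a single worker's list, still in order.
```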

Backpressure & Throttling

What if the Notification Service produces 10k msg/sec, but the SMS Provider (Twilio) allows only 100/sec?

| Strategy | How it works | Trade-off |
| --- | --- | --- |
| Rate Limiting (Token Bucket) | Consumer checks a local/Redis bucket before making an API call. | Simple, effective for API limits. |
| Prefetch Count (RabbitMQ) | Consumer only pulls N messages at a time; it won't take more until it ACKs. | Prevents consumer memory overflow. |
| Delay Queues | If rate limited, push the message to a separate "Delay Queue" (TTL 10s) to retry later. | Adds latency but ensures eventual delivery. |
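A minimal single-process token bucket looks like the sketch below (a Redis-backed version follows the same refill logic; the class and numbers here are illustrative): the consumer asks for a token before each provider call and backs off when the bucket is empty.

```python
# Minimal local token-bucket limiter (sketch). Tokens refill continuously
# at `rate` per second, up to `capacity`; each call spends one token.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate             # tokens refilled per second
        self.capacity = capacity     # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=100, capacity=100)   # e.g. a 100/sec provider limit
allowed = sum(bucket.allow() for _ in range(150))
# Roughly the first 100 calls of the burst pass; the rest must wait or
# be routed to a delay queue.
```

When `allow()` returns False, the consumer either sleeps or re-publishes the message to the delay queue described above.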

Reliability: DLQ & Poison Pills

A "Poison Pill" is a message that always crashes the consumer (e.g., malformed JSON). If you retry 5 times, it crashes 5 times.

Dead Letter Queue (DLQ) Pattern
  1. Consumer reads message.
  2. Tries to process → Fails.
  3. Retry N times (Exponential Backoff).
  4. If it still fails → ACK the message on the main queue, and publish it to a DLQ topic.
  5. Alert engineers to inspect the DLQ manually.
# Python consumer skeleton: retry with exponential backoff, then route to DLQ
import time

MAX_RETRIES = 3

def process_message(msg):
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            do_work(msg)
            return
        except Exception:
            time.sleep(2 ** attempt)  # exponential backoff: 2s, 4s, 8s

    # Failed after all retries -> send to the Dead Letter Queue
    produce_to_dlq(msg)
    log_error(f"Moved message {msg.id} to DLQ")

Summary

  • Use Fanout to decouple producers from multiple downstream consumers.
  • Guarantee Ordering by using Partition Keys (each partition is consumed by one instance within a group).
  • Handle Backpressure using Prefetch limits and consistent Rate Limiters.
  • Never block the pipe: move failing messages to a Dead Letter Queue.