Referral & Affiliate System Design | Attribution Models & Fraud Detection

High-Level Architecture

The goal is to accurately map a Visitor to a Referrer.

Alice shares link (/ref/alice123) → Bob clicks (cookie set: alice123) → Bob signs up (cookie read) → Referral attributed (referrer_id = alice)
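
The click and sign-up steps of this flow can be sketched with Python's standard `http.cookies` module. This is a minimal, framework-agnostic illustration rather than a definitive implementation: the cookie name `ref_code`, the 30-day lifetime, and the `code_to_user` lookup table are assumptions made for the example.

```python
from http.cookies import SimpleCookie
from typing import Dict, Optional

COOKIE_NAME = "ref_code"             # hypothetical cookie name
COOKIE_MAX_AGE = 30 * 24 * 3600      # assumed 30-day lifetime, in seconds


def build_set_cookie_header(ref_code: str) -> str:
    """Build the Set-Cookie header value sent when a visitor clicks a referral link."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = ref_code
    cookie[COOKIE_NAME]["max-age"] = COOKIE_MAX_AGE
    cookie[COOKIE_NAME]["path"] = "/"
    cookie[COOKIE_NAME]["httponly"] = True
    cookie[COOKIE_NAME]["samesite"] = "Lax"
    return cookie[COOKIE_NAME].OutputString()


def read_ref_code(cookie_header: str) -> Optional[str]:
    """Extract the referral code from the Cookie request header at sign-up."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(COOKIE_NAME)
    return morsel.value if morsel else None


def attribute_signup(cookie_header: str, code_to_user: Dict[str, str]) -> Optional[str]:
    """Map the stored referral code to a referrer ID, or None if there is no match."""
    code = read_ref_code(cookie_header)
    return code_to_user.get(code) if code else None
```

For example, `attribute_signup("session=abc; ref_code=alice123", {"alice123": "alice"})` returns `"alice"`, and a request without the cookie returns `None`.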

Generating Referral Codes

Do not use User IDs (e.g., `?ref=1054`) directly. Sequential IDs leak growth metrics: anyone who signs up can see roughly how many users you have. Use random or encoded strings instead.

import secrets
import string

def generate_referral_code(length=8):
    # Base62 alphabet (0-9, a-z, A-Z)
    alphabet = string.ascii_letters + string.digits
    return ''.join(secrets.choice(alphabet) for _ in range(length))

# Example output (random on every call): "Xk92mB7z"
# Example URL: example.com/register?ref=Xk92mB7z
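
Random codes can collide. With 8 Base62 characters there are 62^8 (about 2.18 × 10^14) possible codes, so collisions are rare, but a uniqueness check is still worth having. The sketch below retries on collision; the in-memory `existing_codes` set is an assumption standing in for whatever store holds issued codes (in practice, typically a unique constraint in the database), and `allocate_unique_code` is a hypothetical helper name.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 symbols


def allocate_unique_code(existing_codes: set, length: int = 8, max_attempts: int = 5) -> str:
    """Generate a random code that is not already in existing_codes, then record it."""
    for _ in range(max_attempts):
        code = ''.join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in existing_codes:
            existing_codes.add(code)
            return code
    # Repeated collisions suggest the code space is too small for the user base.
    raise RuntimeError("could not allocate a unique referral code")
```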

Attribution Models

If a user clicks multiple referral links (first Alice's, then Charlie's), who gets the credit?

Model	Logic	Pros	Cons
Last Click (Standard)	Latest referrer cookie overwrites previous.	Rewards the closer. Simple to implement.	Ignores "awareness" drivers.
First Click	First referrer cookie is locked for `N` days.	Rewards the introducer.	Gives no credit to the referrer who prompted the actual sign-up.

Lookback Window: The referral cookie typically expires after 30 or 90 days.
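
The two models and the lookback window can be compared side by side in a small sketch. It assumes clicks are available as a server-side log (the `Click` record and `resolve_referrer` function are hypothetical names), which only approximates the cookie behavior described in the table: a locked first-click cookie and "earliest click inside the window" can differ in edge cases.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Click:
    referrer_code: str
    clicked_at: datetime


def resolve_referrer(
    clicks: List[Click],
    signup_at: datetime,
    model: str = "last_click",
    window_days: int = 30,
) -> Optional[str]:
    """Pick the referrer code to credit, or None if no click falls inside the window."""
    window_start = signup_at - timedelta(days=window_days)
    eligible = [c for c in clicks if window_start <= c.clicked_at <= signup_at]
    if not eligible:
        return None
    eligible.sort(key=lambda c: c.clicked_at)
    if model == "first_click":
        return eligible[0].referrer_code   # earliest click in the window
    if model == "last_click":
        return eligible[-1].referrer_code  # most recent click in the window
    raise ValueError(f"unknown attribution model: {model}")
```

With a click on Alice's link on day 1, a click on Charlie's link on day 10, and a sign-up on day 15, `last_click` credits Charlie and `first_click` credits Alice. Shrinking the window to 7 days credits Charlie under both models, because Alice's click has aged out.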

Fraud Detection

Users will try to refer themselves (self-referral) to collect both the referrer's reward and the new user's reward.

🚫 Basic Checks

Same IP Address?
Same Device Fingerprint?
Referrer Email == New User Email?
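
The three basic checks can be expressed as a function that returns a list of flags rather than a yes/no verdict. This is a sketch under assumptions: the `Account` fields, the flag names, and the email normalization (lowercasing and stripping a `+tag` suffix) are illustrative choices, not a fixed rule set.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Account:
    email: str
    ip_address: str
    device_fingerprint: str


def normalize_email(email: str) -> str:
    """Lowercase the address and drop any '+tag' suffix from the local part (an assumption)."""
    local, _, domain = email.strip().lower().partition("@")
    local = local.split("+", 1)[0]
    return f"{local}@{domain}"


def basic_self_referral_flags(referrer: Account, new_user: Account) -> List[str]:
    """Return the names of the basic checks that matched; an empty list means no match."""
    flags = []
    if referrer.ip_address == new_user.ip_address:
        flags.append("same_ip")
    if referrer.device_fingerprint == new_user.device_fingerprint:
        flags.append("same_device")
    if normalize_email(referrer.email) == normalize_email(new_user.email):
        flags.append("same_email")
    return flags
```

Returning flags instead of a boolean leaves the decision to the caller. A shared IP alone, for instance, may be a household or an office rather than fraud.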

🕵️ Advanced Checks

Fuzzy Name Matching ("John Doe" vs "John Doe 2")
Payment Method Deduplication (Same credit card?)
Velocity Checks (10 referrals in 1 min?)
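
Two of the advanced checks can be sketched with the standard library alone: fuzzy name matching via `difflib.SequenceMatcher` and a sliding-window velocity check. The 0.85 similarity threshold, the class name `VelocityChecker`, and its in-memory storage are assumptions for illustration. The velocity check also assumes timestamps arrive in non-decreasing order.

```python
from collections import deque
from difflib import SequenceMatcher
from typing import Deque, Dict


def names_look_alike(a: str, b: str, threshold: float = 0.85) -> bool:
    """Return True if two names are similar enough to warrant a closer look."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= threshold


class VelocityChecker:
    """Flag a referrer who reaches max_events referrals within window_seconds."""

    def __init__(self, max_events: int = 10, window_seconds: float = 60.0):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events: Dict[str, Deque[float]] = {}  # referrer_id -> recent timestamps

    def record(self, referrer_id: str, timestamp: float) -> bool:
        """Record one referral; return True if the referrer has hit the limit."""
        events = self._events.setdefault(referrer_id, deque())
        events.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) >= self.max_events
```

Here `names_look_alike("John Doe", "John Doe 2")` returns `True`, and a referrer producing ten referrals within one minute is flagged on the tenth call to `record`.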

Summary

Unique Codes: Use random strings, and optionally let users customize them with aliases.
Cookies: Use 1st-party cookies to persist attribution through sign-up.
Fraud: Block the reward, not the sign-up. Let them register, but flag the referral as "Investigate".