Webhook Delivery Reliability Design

Issue: #5 Date: 2026-05-05

Goal

Replace fire-and-forget webhook delivery with retries, delivery logging, HMAC signatures, and a status API. Users get confidence that webhooks are delivered and tools to debug failures.

Approach

Separate async task per webhook (Approach 2). Each webhook gets its own retry loop via asyncio.create_task, so one slow or failing webhook doesn't block the others. Retries run in-process, with no external job queue.

Database Changes

New table: webhook_deliveries

CREATE TABLE webhook_deliveries (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  webhook_id UUID NOT NULL REFERENCES webhooks(id) ON DELETE CASCADE,
  hit_id UUID NOT NULL REFERENCES hits(id) ON DELETE CASCADE,
  status_code INT,
  success BOOLEAN NOT NULL,
  attempt INT NOT NULL DEFAULT 1,
  error_message TEXT,
  delivered_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_webhook_deliveries_webhook_id ON webhook_deliveries(webhook_id);
CREATE INDEX idx_webhook_deliveries_hit_id ON webhook_deliveries(hit_id);

One row per delivery attempt. A dead letter is a (webhook_id, hit_id) pair whose highest attempt is 3 with success = false.

Alter api_keys: add webhook_secret

ALTER TABLE api_keys ADD COLUMN webhook_secret TEXT;

64 random hex chars (32 random bytes from secrets.token_hex(32)), generated alongside the API key. Stored in plaintext because the server needs the secret to sign outgoing webhooks.
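A minimal sketch of the generation step, assuming the secrets.token_hex(32) call named under HMAC Secret Lifecycle. The helper name generate_webhook_secret is hypothetical. token_hex(32) draws 32 random bytes and hex-encodes them, so the resulting string is 64 characters long.

```python
import secrets


def generate_webhook_secret() -> str:
    """Return a fresh webhook secret: 32 random bytes, hex-encoded."""
    # token_hex(n) encodes n random bytes as 2 * n hex characters.
    return secrets.token_hex(32)


secret = generate_webhook_secret()
print(len(secret))  # 64
```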

Retry Logic

Background Task Architecture

Current: record_hit_and_notify does everything sequentially.

New:

  1. record_hit_and_notify inserts the hit (capturing hit_id), queries the webhooks and the owning key's webhook_secret, then spawns one asyncio.create_task(deliver_webhook(...)) per webhook.
  2. deliver_webhook(webhook_id, webhook_url, payload, hit_id, webhook_secret) handles retries:
       • Attempt the POST with a 10s timeout, including the X-Webhook-Signature header
       • Success: log to webhook_deliveries, done
       • Failure: log to webhook_deliveries, sleep (backoff), retry
       • Backoff: attempt 2 after 1s, attempt 3 after 5s
       • After 3 failures: stop (dead letter)
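The retry steps above can be sketched roughly as follows. This is not the real deliver_webhook: to keep the example runnable without a network or database, the HTTP call (send), the delivery logger (log), and the sleep function are injected as hypothetical callables, and the function name deliver_with_retries is made up for illustration. Treating any 2xx status as success is also an assumption; the design does not define it.

```python
import asyncio

MAX_ATTEMPTS = 3
# Delay in seconds before attempt 2 and before attempt 3.
BACKOFF_SECONDS = (1, 5)
TIMEOUT_SECONDS = 10


async def deliver_with_retries(url, body, headers, send, log, sleep=asyncio.sleep):
    """Try up to MAX_ATTEMPTS times, logging one row per attempt.

    send(url, body, headers) -> awaitable returning an HTTP status code.
    log(attempt, success, status_code, error_message) -> awaitable.
    Returns True on success, False when the delivery becomes a dead letter.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        status_code = None
        error_message = None
        try:
            status_code = await asyncio.wait_for(
                send(url, body, headers), timeout=TIMEOUT_SECONDS
            )
            success = 200 <= status_code < 300  # assumption: 2xx means delivered
        except Exception as exc:  # timeout, connection error, and so on
            success = False
            error_message = str(exc) or type(exc).__name__

        await log(attempt, success, status_code, error_message)
        if success:
            return True
        if attempt < MAX_ATTEMPTS:
            await sleep(BACKOFF_SECONDS[attempt - 1])
    return False
```

Passing sleep as a parameter mirrors the mock strategy in the Testing section: tests can substitute a no-op and assert on the requested delays instead of waiting.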

HMAC Signature

Each outgoing webhook POST includes:

X-Webhook-Signature: sha256=<hex_digest>

Computed as hmac.new(webhook_secret.encode(), payload_bytes, hashlib.sha256).hexdigest().

  • DB key links: webhook_secret from the api_keys row (via links.api_key_id)
  • Master key links (api_key_id is NULL): derive from hashlib.sha256(API_KEY.encode()).hexdigest()[:64]
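As a small illustration of the header format, the sketch below wraps the hmac.new(...) expression above in a hypothetical helper named signature_header. It assumes payload_bytes is the exact byte string sent as the POST body; signing a re-serialized copy of the payload could produce a digest the receiver cannot reproduce.

```python
import hashlib
import hmac


def signature_header(webhook_secret: str, payload_bytes: bytes) -> str:
    """Return the value for the X-Webhook-Signature header."""
    digest = hmac.new(
        webhook_secret.encode(), payload_bytes, hashlib.sha256
    ).hexdigest()
    return f"sha256={digest}"


body = b'{"event": "hit"}'  # illustrative payload, not the real schema
headers = {"X-Webhook-Signature": signature_header("example-secret", body)}
```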

Data Flow for Webhook Query

Current query: SELECT webhook_url FROM webhooks WHERE link_id = ?

New: also need webhooks.id (for the deliveries FK) and the owning key's webhook_secret. Two options:

  • Join through links.api_key_id → api_keys.webhook_secret
  • Run a separate query for the secret after getting the link

Use a separate query: fetch the link's api_key_id (already available from the link lookup), then query api_keys.webhook_secret if api_key_id is not null.
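One way the secret lookup might look, as a sketch only. The function name resolve_webhook_secret and the injected fetch_secret coroutine (standing in for the api_keys query) are hypothetical; the master-key derivation is the expression given in the HMAC Signature section.

```python
import hashlib
from typing import Awaitable, Callable, Optional


async def resolve_webhook_secret(
    api_key_id: Optional[str],
    master_api_key: str,
    fetch_secret: Callable[[str], Awaitable[Optional[str]]],
) -> Optional[str]:
    """Pick the signing secret for a link.

    fetch_secret stands in for: SELECT webhook_secret FROM api_keys WHERE id = ?
    """
    if api_key_id is None:
        # Master key link: derive the secret from the master API key.
        # A SHA-256 hex digest is already 64 characters, so the [:64]
        # slice leaves it unchanged.
        return hashlib.sha256(master_api_key.encode()).hexdigest()[:64]
    # DB key link: the column is nullable, so this may return None for
    # keys without a secret; handling that case is left to the caller.
    return await fetch_secret(api_key_id)
```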

Delivery Status API

GET /{short_id}/webhooks/status

Requires auth (X-API-Key). Master key can view any link. DB key can only view links it owns.

Response:

{
  "webhooks": [
    {
      "webhook_url": "https://example.com/hook",
      "deliveries": [
        {
          "hit_id": "uuid",
          "attempts": [
            {"attempt": 1, "success": false, "status_code": 500, "error_message": null, "delivered_at": "..."},
            {"attempt": 2, "success": true, "status_code": 200, "error_message": null, "delivered_at": "..."}
          ]
        }
      ]
    }
  ]
}

Returns the 50 most recent deliveries per webhook. No cursor pagination for MVP.
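The response shape above could be assembled from flat webhook_deliveries rows along these lines. This is a sketch under assumptions: rows arrive as dicts that already include webhook_url (for example from a join with webhooks), a "delivery" means one hit_id with all of its attempts, and recency is judged by the final attempt's delivered_at. The name build_status_response is hypothetical.

```python
from collections import defaultdict

ATTEMPT_FIELDS = ("attempt", "success", "status_code", "error_message", "delivered_at")


def build_status_response(rows, limit=50):
    """Group flat delivery rows into the nested status response."""
    # (webhook_id, webhook_url) -> hit_id -> list of attempt rows
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        grouped[(row["webhook_id"], row["webhook_url"])][row["hit_id"]].append(row)

    webhooks = []
    for (_, url), by_hit in grouped.items():
        deliveries = []
        for hit_id, attempts in by_hit.items():
            attempts.sort(key=lambda r: r["attempt"])
            deliveries.append({
                "hit_id": hit_id,
                "attempts": [{f: r[f] for f in ATTEMPT_FIELDS} for r in attempts],
            })
        # Newest delivery first, judged by the timestamp of its final attempt.
        deliveries.sort(key=lambda d: d["attempts"][-1]["delivered_at"], reverse=True)
        webhooks.append({"webhook_url": url, "deliveries": deliveries[:limit]})
    return {"webhooks": webhooks}
```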

Dead Letter

No separate table. Dead letters = rows in webhook_deliveries where attempt = 3 AND success = false. The status API surfaces these naturally.
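The dead-letter rule can be expressed as a filter over delivery rows. A minimal sketch, assuming rows are dicts keyed by the webhook_deliveries column names; the helper name dead_letters is hypothetical.

```python
MAX_ATTEMPTS = 3


def dead_letters(rows):
    """Return sorted (webhook_id, hit_id) pairs whose final attempt failed."""
    return sorted({
        (row["webhook_id"], row["hit_id"])
        for row in rows
        if row["attempt"] == MAX_ATTEMPTS and not row["success"]
    })
```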

HMAC Secret Lifecycle

  • Create: Generated with API key (secrets.token_hex(32)), stored in api_keys.webhook_secret
  • Returned: In ApiKeyResponse (create) and ApiKeyRotateResponse (rotate)
  • Rotation: New secret generated on key rotation. Deliveries are signed with the new secret immediately, so receivers still verifying with the old secret will reject them until they update.
  • Master key links: Derive from hashlib.sha256(API_KEY.encode()).hexdigest()[:64]
  • Verification by receivers: Compare X-Webhook-Signature header against HMAC-SHA256 of raw body using their webhook_secret
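On the receiving side, verification might look like the sketch below. The function name verify_signature is hypothetical, and the use of hmac.compare_digest (a constant-time comparison from the standard library) is a suggestion rather than something the design mandates. The raw request body must be used as received, before any JSON parsing.

```python
import hashlib
import hmac


def verify_signature(webhook_secret: str, raw_body: bytes, header_value: str) -> bool:
    """Check an X-Webhook-Signature header against the raw request body."""
    expected = "sha256=" + hmac.new(
        webhook_secret.encode(), raw_body, hashlib.sha256
    ).hexdigest()
    # Compare as bytes so a header with non-ASCII characters fails cleanly
    # instead of raising TypeError.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```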

Changes to Existing Code

record_hit_and_notify

  • Capture hit_id from hit insert
  • Query webhooks.id in addition to webhook_url
  • Look up webhook_secret (from api_keys or derived from master key)
  • Spawn asyncio.create_task(deliver_webhook(...)) per webhook instead of inline POST

POST /api-keys

  • Generate webhook_secret alongside key
  • Store in DB, return in response

POST /api-keys/{key_id}/rotate

  • Generate new webhook_secret
  • Update in DB, return in response

Response models

  • ApiKeyResponse: add webhook_secret: str
  • ApiKeyRotateResponse: add webhook_secret: str

Testing

  • deliver_webhook unit tests: success on first try, retry then success, 3 failures (dead letter), timeout, HMAC correctness
  • record_hit_and_notify tests: updated for new spawn-per-webhook pattern
  • GET /{short_id}/webhooks/status tests: auth, ownership, 404, response shape
  • Mock strategy: patch asyncio.sleep to avoid delays, mock httpx responses per attempt

Out of Scope

  • Webhook URL verification/validation on registration
  • Configurable retry count or backoff per user
  • Webhook event types (currently only "hit" events exist)