Cross-service handoffs

Inside one database, Postgres can make the business write and the work row atomic. Across services, that guarantee ends.

Use a transactional outbox at the producer, transport between services, and idempotent consumption at the receiver. The handoff is explicit: durable intent on your side, at-least-once delivery in the middle, dedupe on theirs.

When the commit boundary ends

Inside one database, the story is clean: one commit can update an order and enqueue the follow-up work. The moment work leaves that database or deployable unit, you no longer have one transaction. You have a handoff.

From that point on, you need transport: a way for intent to cross from your Postgres into someone else’s process. A boundary is anywhere ownership changes: a different database, a different deploy unit, a broker, or an external API.

Use an outbox relay on the producer and an inbox on the receiver. The transport in between only delivers bytes; it does not replace leases or claim loops on either side.

What each part can guarantee

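Your database can be strongly consistent inside its boundary. The broker and the search index cannot share that guarantee: they commit on their own timelines. “Paid” and “visible in search” are different guarantees with different systems behind them.

From the producer’s side, transport and downstream consumers are always eventual. ACID applies only within a single database: yours when you write the business row and the outbox row, theirs when they record the inbox row.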

Layer	What holds	Consistency	Postgres tools
Inside your DB	Business row + outbox/inbox row commit together. Claims are exclusive	Strong (ACID, row locks)	Transactional outbox, `SKIP LOCKED`, leases
Your worker layer	Each row processed at least once. Optional per-key FIFO	At-least-once + idempotent effects	Hash ring, heartbeats, ordering guard, fencing
Transport	Message or HTTP delivery may retry, reorder (across keys), or lag	Eventual, at-least-once typical	Broker, webhook, outbox relay
Downstream consumer	Their read model, inbox, or side effect catches up later	Eventual: bounded lag is the SLO	Their idempotency + your contract

Cross-boundary architecture

Guarantees stop at each service boundary. Make the handoff explicit: durable intent on your side, idempotent consumption on theirs, and transport in between that may retry.

Three zones: producer DB (ACID), transport (at-least-once), consumer (catches up later).

Diagram: three zones joined by handoffs.

Producer service: API / domain logic performs the business mutation. orders + outbox are written in the same transaction (atomic commit). A relay worker runs the claim loop: claim · lease · publish.

Transport (shared infra): message broker / stream (routing · delivery · fan-out), HTTP webhook, or partner REST API. At-least-once, retries, no distributed 2PC.

Downstream service: a webhook endpoint (→ INSERT inbox) or a stream consumer (→ inbox) writes to their inbox: same pattern, their DB. Their workers update the read model / index, which is eventually consistent.

Each box is a separate consistency boundary. Handoffs need explicit contracts and idempotent consumers, not a shared transaction across services.

Transactional outbox for `order:9182`

Make the intent durable. When payment succeeds, you need search to know eventually. You cannot safely call search inside the transaction: the call can succeed while the commit fails, or the commit can succeed while the call fails. You can insert an outbox row that says “publish OrderPaid” in the same commit as UPDATE orders.

The Transactional Outbox pattern is the bridge: write intent in the same transaction as the domain mutation, then let a relay worker publish to your transport. In this implementation, the relay is the same competing-consumers loop from earlier chapters, applied to outbound events. You get:

Atomic intent: the order is marked paid and the OrderPaid outbox row is written in one commit.
Relay: a worker claims the outbox row and publishes it to the transport.
At-least-once publish: a crash after publish but before the row is marked complete causes a retry, so every publish carries the idempotency key.
Consumer dedupes: the subscriber uses the event_id / idempotency key to drop repeats.

BEGIN;
UPDATE orders SET status = 'paid' WHERE id = 9182;  -- business fact the API must not lie about
INSERT INTO outbox (partition_key, event_type, payload, idempotency_key)
VALUES (
  'order:9182',
  'OrderPaid',
  '{"order_id":9182,"paid_at":"2026-07-01T12:00:00Z"}'::jsonb,
  'evt-order-9182-paid-v1'
);  -- durable publish intent, not the wire publish
COMMIT;  -- both rows exist or neither. No ghost OrderPaid events
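The relay half of the pattern can be sketched as follows. This is a minimal illustration under stated assumptions, not the complete implementation: the `OutboxStore` interface, the `relayOnce` function, and the `id`, `status`, and `locked_until` columns in the claim query are hypothetical names added for the example. Only `event_type`, `payload`, and `idempotency_key` come from the INSERT above.

```typescript
// Sketch only: the store interface and the extra columns (id, status,
// locked_until) are hypothetical, not part of the schema shown above.
type OutboxRow = {
  id: number;
  eventType: string;
  payload: unknown;
  idempotencyKey: string;
};

interface OutboxStore {
  claim(limit: number): Promise<OutboxRow[]>; // would run CLAIM_SQL in Postgres
  complete(id: number): Promise<void>; // mark the row done
  release(id: number, error: string): Promise<void>; // hand it back for retry
}

// One possible claim query: lease a batch and skip rows other relays hold.
const CLAIM_SQL = `
  UPDATE outbox
     SET status = 'claimed', locked_until = now() + interval '30 seconds'
   WHERE id IN (
     SELECT id FROM outbox
      WHERE status = 'pending'
         OR (status = 'claimed' AND locked_until < now())
      ORDER BY id
      LIMIT $1
      FOR UPDATE SKIP LOCKED
   )
  RETURNING id, event_type, payload, idempotency_key`;

// Publish each claimed row, then mark it complete. A crash between publish and
// complete leaves the row claimed; once the lease expires it is published
// again, which is why the idempotency key travels with every message.
async function relayOnce(
  store: OutboxStore,
  publish: (row: OutboxRow) => Promise<void>,
  batchSize = 10,
): Promise<{ published: number; failed: number }> {
  const rows = await store.claim(batchSize);
  let published = 0;
  let failed = 0;
  for (const row of rows) {
    try {
      await publish(row);
      await store.complete(row.id);
      published += 1;
    } catch (err) {
      await store.release(row.id, String(err));
      failed += 1;
    }
  }
  return { published, failed };
}
```

Injecting `store` and `publish` keeps the loop testable without a database or broker. A failed publish releases the row rather than dropping it, so delivery stays at-least-once.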

How search learns the order paid

Payment for order:9182 lives in your database. Search does not. Three ways teams bridge that:

Sync HTTP in the request: a partner outage becomes your outage, and retries duplicate work without idempotency. Avoid this at the service boundary.
Transactional outbox + relay: publish intent in the same transaction as UPDATE orders. Delivery is at-least-once; the consumer dedupes; lag is measurable. Use when search or analytics must update after payment.
CDC from the WAL: no application change on the write path, but it streams row-level changes rather than domain events and breaks easily on schema refactors. Use for legacy databases you cannot modify.

Choosing a transport

After the outbox row is durable, choose transport based on the consumer, the SLO, and who operates the middleware.

Kind	Good when	Optimizes for	Breaks when
Message broker	Service-to-service delivery, routing keys, ops-owned middleware	Fast handoff and per-queue routing	You need long retention replay for many independent consumers
Event log / stream	Many subscribers, replay, high-volume shared log	Fan-out and retention	Point-to-point only with one consumer team
Managed queue	Cloud-native, minimal ops, push to one consumer group	Managed durability and scaling	Strict per-key ordering without paying for FIFO SKUs
HTTP / webhooks	One partner or SaaS exposes a URL	Simple integration surface	Partner outage blocks your relay unless you queue and retry
Poll / RPC pull	No push. Batch export or client-pull APIs	Receiver controls cadence	Low latency SLOs without aggressive polling
CDC	Downstream needs row changes, not domain events	Decoupling from app publish code	Schema refactors and domain event contracts
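For the HTTP / webhooks row, “queue and retry” usually means the relay reschedules a failed row instead of blocking on the partner. The sketch below shows one possible schedule, exponential backoff with a cap and a dead-letter cutoff. The `RetryPolicy` shape, the `nextRetry` name, and the numbers are assumptions for illustration, not values from this article.

```typescript
// Hypothetical retry policy for a relay that delivers over HTTP. The numbers
// are placeholders to tune, not recommendations.
type RetryPolicy = { baseMs: number; capMs: number; maxAttempts: number };

type RetryDecision =
  | { kind: "retry"; delayMs: number }
  | { kind: "dead-letter" };

// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ... capped at capMs.
// `attempt` counts deliveries already tried (1 after the first failure).
function nextRetry(attempt: number, policy: RetryPolicy): RetryDecision {
  if (attempt >= policy.maxAttempts) return { kind: "dead-letter" };
  const delayMs = Math.min(policy.capMs, policy.baseMs * 2 ** (attempt - 1));
  return { kind: "retry", delayMs };
}

const policy: RetryPolicy = { baseMs: 1_000, capMs: 60_000, maxAttempts: 8 };
```

A relay using this would store the next attempt time on the outbox row and release its lease, so a partner outage delays that row without stalling the claim loop. Adding random jitter to `delayMs` is a common refinement when many rows fail at once.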

Inbound: webhooks and broker consumers become inbox rows

When Stripe or a partner POSTs to you, acknowledge fast, persist durably, and process asynchronously. Their retries are your at-least-once delivery.

Return 200 after the row is inserted. Processing runs in the claim loop, not in the HTTP handler.

// POST /webhooks/stripe
await db.query(
  `
  INSERT INTO inbox (partition_key, payload, idempotency_key)
  VALUES ($1, $2, $3)
  ON CONFLICT (idempotency_key) WHERE idempotency_key IS NOT NULL DO NOTHING  -- partner retry hits here, not in handler
`,
  [event.accountId, body, event.id],
);

return res.status(200).send(); // ack after durable row. worker claims async

Verify signatures at the boundary. Return 2xx only after the row is durable. Their retries hit ON CONFLICT DO NOTHING on enqueue.
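The retry behaviour described above can be modelled without a database. In this sketch an in-memory map stands in for the inbox table and its unique `idempotency_key`. `InMemoryInbox`, `handleWebhook`, and the `signatureOk` flag are hypothetical names, and signature verification itself is assumed to happen in the provider’s library before this code runs.

```typescript
// In-memory stand-in for the inbox table, to show the retry behaviour only.
// A real handler would run the INSERT ... ON CONFLICT DO NOTHING shown above.
type WebhookEvent = { id: string; accountId: string; body: string };

class InMemoryInbox {
  private rows = new Map<string, WebhookEvent>(); // keyed by idempotency_key

  // Returns true when a new row was stored, false when the key already existed.
  insertIfAbsent(event: WebhookEvent): boolean {
    if (this.rows.has(event.id)) return false;
    this.rows.set(event.id, event);
    return true;
  }

  get size(): number {
    return this.rows.size;
  }
}

// Hypothetical handler core: reject bad signatures, otherwise persist and ack.
function handleWebhook(
  inbox: InMemoryInbox,
  event: WebhookEvent,
  signatureOk: boolean,
): { status: number; inserted: boolean } {
  if (!signatureOk) return { status: 400, inserted: false };
  const inserted = inbox.insertIfAbsent(event);
  // Duplicates still get 200: the row is already durable, so the sender can
  // stop retrying.
  return { status: 200, inserted };
}
```

The point of the model is that a retried delivery and a first delivery return the same status. Only the row count differs, and that difference stays inside the receiver.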

Contracts across boundaries

Shared transactions are gone. Shared vocabulary is not. When two teams own two databases, coordination becomes documentation plus stable ids. The table below is the minimum contract for OrderPaid on order:9182 to land safely in search.

Contract piece	Producer owes	Consumer owes
`event_id` / idempotency key	Stable, unique per logical event	Dedupe store or unique constraint
`partition_key` / message key	Same key for all ordered events in a stream	Single consumer per key (or ordering guard)
Schema version	Backward-compatible changes. Bump version on breaks	Reject or dead-letter unknown versions
Delivery semantics	Document at-least-once. No silent drops	Idempotent handlers. Expose lag metrics
Replay	Retention policy or archive for re-publish	Safe to re-process historical events
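On the consumer side, the first three contract rows can be checked in one place before any handler runs. The sketch below is illustrative only: the `Envelope` field names, the per-key `sequence` counter used as an ordering guard, and the `ConsumerGate` class are assumptions, and the in-memory sets stand in for whatever dedupe store or unique constraint the consumer actually uses.

```typescript
// Illustrative consumer-side gate. Field names are assumptions; the contract
// above only requires a stable event id, a partition key, and a schema version.
type Envelope = {
  eventId: string;
  partitionKey: string;
  schemaVersion: number;
  sequence: number; // hypothetical per-key counter used as the ordering guard
};

type Verdict = "apply" | "duplicate" | "stale" | "dead-letter";

class ConsumerGate {
  private seen = new Set<string>(); // stands in for a unique constraint
  private lastSequence = new Map<string, number>(); // highest applied per key
  private supportedVersions: number[];

  constructor(supportedVersions: number[]) {
    this.supportedVersions = supportedVersions;
  }

  check(e: Envelope): Verdict {
    // Unknown schema version: do not guess, route to the dead-letter path.
    if (this.supportedVersions.indexOf(e.schemaVersion) === -1) return "dead-letter";
    // Same logical event delivered again: at-least-once in action.
    if (this.seen.has(e.eventId)) return "duplicate";
    // Older than what was already applied for this key: ordering guard.
    const last = this.lastSequence.get(e.partitionKey) ?? -1;
    if (e.sequence <= last) return "stale";
    this.seen.add(e.eventId);
    this.lastSequence.set(e.partitionKey, e.sequence);
    return "apply";
  }
}
```

Whether a stale event should be dropped, parked, or merged depends on the domain. The sketch only shows where that decision would sit.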

Sagas and long flows across services

Some workflows outlive one request: charge card, reserve inventory, ship, notify search. That is many local commits linked by events, not one distributed transaction. Sagas are durable rows chained over time. Each step must tolerate retry.

Do

Idempotent steps, explicit timeouts, compensations as first-class outbox events, dashboards on end-to-end lag (order paid → search indexed).

Avoid

Synchronous chains of five HTTP calls in one request, assuming partner callbacks arrive exactly once, and blocking the user response on downstream indexing.
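One way to picture “durable rows chained over time” is a transition function over the saga row. The sketch below is a simplified model: the state, event, and command names are invented for illustration, and timeouts are left out. In a real service, the new state and the returned command would be written together, with the command stored as an outbox row.

```typescript
// Hypothetical saga for the flow named above. Each returned command would be
// written as an outbox row in the same transaction that records the new state.
type SagaState = "charging" | "reserving" | "shipping" | "refunding" | "done" | "cancelled";

type SagaEvent =
  | "CardCharged"
  | "InventoryReserved"
  | "InventoryUnavailable"
  | "Shipped"
  | "CardRefunded";

type Transition = { state: SagaState; command: string | null };

function step(state: SagaState, event: SagaEvent): Transition {
  switch (`${state}:${event}`) {
    case "charging:CardCharged":
      return { state: "reserving", command: "ReserveInventory" };
    case "reserving:InventoryReserved":
      return { state: "shipping", command: "ShipOrder" };
    case "reserving:InventoryUnavailable":
      return { state: "refunding", command: "RefundCard" }; // compensation
    case "shipping:Shipped":
      return { state: "done", command: "NotifySearch" };
    case "refunding:CardRefunded":
      return { state: "cancelled", command: null };
    default:
      // Redelivered or out-of-place event: keep the state, emit nothing.
      return { state, command: null };
  }
}
```

The default branch is what makes a step tolerate retry in this model: a second `CardCharged` arriving while the saga is already `reserving` changes nothing and emits no command.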

Explaining lag to product and leadership

Payment in your DB can be immediate while search and analytics catch up on a separate timeline. Use this distinction when someone asks why search lags payment by thirty seconds.

User-facing write path. Strong in your DB: “Payment recorded” is true when the API returns success.
Derived views. Eventual. Search, analytics, and recommendations catch up in seconds or minutes; define the SLO.
External systems. Eventual with retries. A webhook may arrive twice and a partner ledger may lag; the contract defines the maximum delay. The mechanics are the same whether the other party is another team or Stripe.
Failure mode. At-least-once everywhere outside ACID. Duplicates are normal, so idempotency is not optional.
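The “define the SLO” step becomes concrete once lag is measured per order. The following sketch assumes a sample shape (`paidAt`, `indexedAt`) and a 30-second SLO purely for illustration. In practice the timestamps would come from the outbox row and from whatever the search consumer records when it indexes.

```typescript
// Sketch of an end-to-end lag check (order paid -> search indexed).
// The sample shape and the SLO value are assumptions for illustration.
type LagSample = { orderId: number; paidAt: string; indexedAt: string | null };

function lagReport(samples: LagSample[], now: string, sloMs: number) {
  const lags = samples.map((s) => ({
    orderId: s.orderId,
    // Not indexed yet: the lag is still growing, so measure against `now`.
    lagMs: Date.parse(s.indexedAt ?? now) - Date.parse(s.paidAt),
  }));
  const maxLagMs = lags.reduce((max, l) => Math.max(max, l.lagMs), 0);
  const breaches = lags.filter((l) => l.lagMs > sloMs).map((l) => l.orderId);
  return { maxLagMs, breaches };
}

const report = lagReport(
  [
    { orderId: 9182, paidAt: "2026-07-01T12:00:00Z", indexedAt: "2026-07-01T12:00:04Z" },
    { orderId: 9183, paidAt: "2026-07-01T12:00:10Z", indexedAt: null },
  ],
  "2026-07-01T12:00:50Z",
  30_000,
);
```

Here order 9182 was indexed four seconds after payment, while order 9183 has been waiting forty seconds and breaches the assumed SLO. A report like this is the kind of number that turns “search lags payment” into something a dashboard can track.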

Where this implementation fits

Match the tool to the constraint. Postgres coordinates work inside a service. Brokers and queues help at the boundary when fan-out, retention, or cross-team delivery require them.

Strong fit

Durable work with claims, leases, and retries
Postgres already in the stack (or inbox/outbox tables are acceptable)
Per-entity ordering or partition affinity matters
Producers and workers share a database, or each service boundary has its own outbox/inbox tables
Backlog depth and stuck rows should be visible in SQL

Weaker fit: consider other tools

Shared event log with long retention and many independent replay consumers → Kafka / Pulsar as platform infra
Fan-out streaming is the product, not per-service claim coordination
Jobs-only background work, no custom ordering → pg-boss, Graphile Worker, or River
No database on either end can participate in inbox/outbox
Postgres is saturated after tuning and coordination itself must move off-DB

Source

Use the article for explanation, then use these files when you want the complete SQL and TypeScript in one place.