You update an order and publish order.paid. Two systems, two writes, no shared transaction. Whatever you do, there is a window where one succeeds and the other doesn’t — and now your database and your broker disagree.
The dual-write trap
Commit the row, then publish: the process can die before the publish. Publish, then commit: the transaction can roll back after the event has already gone out. No ordering saves you, because the two writes are not one atomic act.
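To make both failure modes concrete, here is a minimal sketch using an in-memory SQLite database and a plain Python list standing in for the broker. The names `Crash`, `make_db`, `commit_then_publish`, `publish_then_commit`, and `run` are hypothetical, and the crash is simulated with an exception rather than a real process death.

```python
import sqlite3


class Crash(Exception):
    """Stands in for the process dying or the transaction failing."""


def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    db.execute("INSERT INTO orders VALUES (1, 'pending')")
    db.commit()
    return db


def commit_then_publish(db, broker, crash):
    db.execute("UPDATE orders SET status = 'paid' WHERE id = 1")
    db.commit()
    if crash:
        raise Crash("died after commit, before publish")
    broker.append("order.paid")


def publish_then_commit(db, broker, crash):
    db.execute("UPDATE orders SET status = 'paid' WHERE id = 1")
    broker.append("order.paid")  # the event has left the building
    if crash:
        db.rollback()
        raise Crash("rolled back after publish")
    db.commit()


def run(write):
    """Run one write strategy with a simulated failure; report both sides."""
    db, broker = make_db(), []
    try:
        write(db, broker, crash=True)
    except Crash:
        pass
    status = db.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
    return status, broker


print(run(commit_then_publish))  # order is paid, broker never heard about it
print(run(publish_then_commit))  # broker says paid, order is still pending
```

Either way the two systems end up telling different stories, and nothing in the request path can repair that after the fact.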
Don’t publish from the request. Write the event into an outbox table inside the same transaction as your domain change. One commit, both facts. A separate relay ships it to the broker afterwards.
BEGIN;
UPDATE orders SET status = 'paid' WHERE id = $1;
INSERT INTO outbox (aggregate_id, type, payload)
VALUES ($1, 'order.paid', $2);
COMMIT; -- both rows commit together, or neither does
The relay
- Poll (or tail the WAL for) unsent outbox rows in order
- Publish to the broker with at-least-once delivery
- Mark rows sent only after the broker acks
- Consumers dedupe by event id — at-least-once means retries happen
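The steps above can be sketched as a polling relay. This is an illustration under stated assumptions, not a production relay: it uses in-memory SQLite, a single writer, a hypothetical `sent` flag on the outbox table, and the row's auto-incremented `id` as the event id. `mark_paid`, `relay_once`, `Consumer`, and `flaky_publish` are made-up names, and the "broker" is a function call that returns `True` for an ack.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- doubles as the event id
        aggregate_id INTEGER, type TEXT, payload TEXT,
        sent INTEGER NOT NULL DEFAULT 0        -- assumed column, not in the post
    );
    INSERT INTO orders VALUES (1, 'pending');
""")


def mark_paid(order_id):
    # Domain change and event in one transaction: both commit or neither does.
    with db:
        db.execute("UPDATE orders SET status = 'paid' WHERE id = ?", (order_id,))
        db.execute(
            "INSERT INTO outbox (aggregate_id, type, payload) VALUES (?, ?, ?)",
            (order_id, "order.paid", json.dumps({"order_id": order_id})),
        )


def relay_once(publish):
    """Ship unsent rows in id order; mark each sent only after the ack."""
    rows = db.execute(
        "SELECT id, type, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for event_id, event_type, payload in rows:
        if not publish(event_id, event_type, payload):
            break  # no ack: stop so later events do not overtake this one
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (event_id,))


class Consumer:
    """Remembers the event ids it has handled and drops repeats."""

    def __init__(self):
        self.seen = set()
        self.handled = []

    def handle(self, event_id, event_type, payload):
        if event_id in self.seen:
            return  # duplicate delivery: nothing to redo
        self.seen.add(event_id)
        self.handled.append((event_type, json.loads(payload)))


def flaky_publish(consumer):
    """Deliver every message, but 'lose' the ack for the first call."""
    state = {"calls": 0}

    def publish(event_id, event_type, payload):
        state["calls"] += 1
        consumer.handle(event_id, event_type, payload)
        return state["calls"] > 1

    return publish


consumer = Consumer()
publish = flaky_publish(consumer)
mark_paid(1)
relay_once(publish)  # delivered, ack lost: the row stays unsent
relay_once(publish)  # redelivered, acked, marked sent
print(consumer.handled)  # one event, despite two deliveries
```

The lost ack is the interesting case: the relay cannot tell "never arrived" from "arrived, ack dropped", so it sends again, and the consumer's id check is what keeps the retry harmless.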
Tail the WAL instead of polling and the outbox costs the source database almost nothing beyond the insert itself: no triggers, no polling queries. That is exactly how Argus streams audit events.
Distributed systems are trust problems in disguise. The outbox turns "I hope it published" into "it’s in the same commit."
Slower? Marginally. Correct? Completely. I’ll take the trade every time.