mirror of https://github.com/DeBrosOfficial/orama.git (synced 2026-06-17 03:34:13 +00:00)
Root cause of the recurring "turn.credentials → namespace_not_configured" on a
distant node: at converge the gateway resolves its TURN secret from the
namespace rqlite, and on a slow/just-restarted node that read fails ONCE, so
the gateway is written with TURN disabled. Removing the node is not a fix — the
software must tolerate a slow read.
Two-part fix (complements e7ed718's "don't blank a warm config"):
- RETRY the secret read (5×2s) at converge so a node whose rqlite is still
syncing waits for it to land instead of writing an empty block once. A
genuine decrypt failure still exhausts the retries → unresolved → the
running config is preserved.
- CACHE the resolved secret into the node's own cluster-state.json
(applyResolvedWebRTCToState), so the NEXT cold start reads it from disk —
chooseRestoreWebRTC is state-first and short-circuits before the DB. The
state struct already had TURNSharedSecret "for cold start" but nothing
populated it; now it's filled on every successful resolve (only rewritten
on change). Each node self-heals its own cache; nothing new is sent
cross-node.
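
The two parts above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: resolveWithRetry, chooseSecret, applyResolved and errNotReady are hypothetical names, the real functions (applyResolvedWebRTCToState, chooseRestoreWebRTC) are not reproduced here, and reading "5×2s" as five attempts spaced two seconds apart is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical sentinel for "the namespace rqlite has not
// synced the secret yet".
var errNotReady = errors.New("turn secret not readable yet")

// resolveWithRetry calls read up to attempts times, sleeping delay between
// failed attempts. It returns the first non-empty secret, or the last error
// once the attempts are exhausted; the caller then treats the secret as
// unresolved and keeps the running config.
func resolveWithRetry(read func() (string, error), attempts int, delay time.Duration) (string, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		secret, err := read()
		if err == nil && secret != "" {
			return secret, nil
		}
		if err == nil {
			err = errNotReady // an empty secret counts as "not there yet"
		}
		lastErr = err
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return "", fmt.Errorf("unresolved after %d attempts: %w", attempts, lastErr)
}

// chooseSecret is state-first: a cached secret short-circuits before the DB.
func chooseSecret(cached string, read func() (string, error), attempts int, delay time.Duration) (string, error) {
	if cached != "" {
		return cached, nil
	}
	return resolveWithRetry(read, attempts, delay)
}

// applyResolved stores secret in *cached and reports whether the cache
// changed, so the state file is only rewritten on change.
func applyResolved(cached *string, secret string) bool {
	if secret == "" || *cached == secret {
		return false
	}
	*cached = secret
	return true
}

func main() {
	calls := 0
	slowRead := func() (string, error) { // succeeds on the third call
		calls++
		if calls < 3 {
			return "", errNotReady
		}
		return "s3cret", nil
	}

	cached := ""
	secret, err := chooseSecret(cached, slowRead, 5, 10*time.Millisecond)
	fmt.Println(secret, err, calls)
	fmt.Println(applyResolved(&cached, secret), applyResolved(&cached, secret))
}
```

In this sketch a read that never succeeds returns an error instead of an empty secret, so the caller can tell "unresolved" apart from "TURN disabled" and leave the running config alone.
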
cluster-state.json now carries the TURN secret, so both writers (local
saveLocalState and the remote SaveClusterState) are tightened to 0600 + chmod.
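
A hedged sketch of why the write is paired with a chmod, assuming the writers use Go's os.WriteFile (the source does not say): os.WriteFile applies the given mode only when it creates the file, so an existing cluster-state.json with looser permissions would keep them. writeStateFile, demo and the JSON key are hypothetical and only illustrate the pattern.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeStateFile writes data with mode 0600, then chmods explicitly because
// os.WriteFile does not change the permissions of a file that already exists.
func writeStateFile(path string, data []byte) error {
	if err := os.WriteFile(path, data, 0o600); err != nil {
		return err
	}
	return os.Chmod(path, 0o600)
}

// demo creates a world-readable state file, rewrites it through
// writeStateFile, and returns the resulting permission bits.
func demo() (os.FileMode, error) {
	dir, err := os.MkdirTemp("", "state-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "cluster-state.json")
	// Simulate a state file left behind with the old, looser mode.
	if err := os.WriteFile(path, []byte("{}"), 0o644); err != nil {
		return 0, err
	}
	// The key name below is made up for the example.
	if err := writeStateFile(path, []byte(`{"turn_shared_secret":"example"}`)); err != nil {
		return 0, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	mode, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%o\n", mode)
}
```

On a Unix-like system the file should end up at mode 600 even though it started at 644; without the chmod it would stay at 644.
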
A stale cached secret self-heals: disabling and re-enabling webrtc re-pushes
every node's config, and the next converge re-caches the new value.
Dual-reviewed: code-quality APPROVED; security SECURE after the remote-write
0600 fix. Tests: cache populate + short-circuit, no-change, turn-only node.