- Add `turn_stealth_domain` to gateway config for stealth TURN support
- Introduce `turn_discovery` in `sni-router` to auto-discover per-namespace routes
- Add database migration to enable stealth TURN per namespace
- Document ephemeral state API in `SERVERLESS.md`
- Split WebRTC route gating into `webrtcServeTURNCredentials` and `webrtcServeSFURoutes` to allow non-SFU gateways to mint TURN credentials
- Update `chooseRestoreWebRTC` to correctly resolve configurations for nodes without local SFU ports
- Add unit tests to verify independent route registration logic (bugboard #25)
- Update route registration logic to rely solely on SFUPort > 0, resolving a silent 404 issue where gateways with valid SFU configurations were incorrectly disabled.
- Retain WebRTCEnabled in config for backward compatibility with existing operator YAML and request schemas.
- Add unit tests to pin registration behavior and prevent future regressions.
#348 - APNs silent-drop guard
Apple's APNs silently returns HTTP 200 for pushes with no visible
content (no title, no body, no badge, no sound, no
content-available=1) and then drops them — which looked to the WASM
caller like a successful delivery. Now rejected up-front with the new
push.ErrEmptyContent sentinel, and the APNs provider returns the
structured push.PushError shape (HTTPStatus, Reason, Unregistered,
Wrapped) so the dispatcher can branch on Unregistered to remove dead
tokens automatically. Legacy ErrDeviceUnregistered sentinel is
preserved for errors.Is compatibility (wrapped inside PushError).
Always logs APNs HTTP response (status, reason, apns_id, token prefix)
so future silent-drop classes show up in operator logs.
content-available is also now correctly mapped from snake_case
Data["content_available"] (any truthy variant) into Apple's
canonical "content-available": 1 inside the aps dictionary.
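
A minimal sketch of the guard and error shape described above. The sentinel names and the PushError field names come from this commit; the Notification type, the Data map being map[string]string, the set of accepted truthy strings, and every function name here are assumptions made for illustration, not the provider's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Sentinels named in the commit; the messages are assumptions.
var (
	ErrEmptyContent       = errors.New("push: notification has no visible content")
	ErrDeviceUnregistered = errors.New("push: device unregistered")
)

// PushError carries the fields listed in the commit. The methods are assumptions.
type PushError struct {
	HTTPStatus   int
	Reason       string
	Unregistered bool
	Wrapped      error
}

func (e *PushError) Error() string {
	return fmt.Sprintf("push failed: status=%d reason=%s", e.HTTPStatus, e.Reason)
}

// Unwrap keeps errors.Is(err, ErrDeviceUnregistered) working for legacy callers.
func (e *PushError) Unwrap() error { return e.Wrapped }

// Notification is a hypothetical stand-in for the real push payload type.
type Notification struct {
	Title, Body, Sound string
	Badge              *int
	Data               map[string]string
}

// isTruthy accepts a few common spellings; the real accepted set is not known.
func isTruthy(v string) bool {
	switch strings.ToLower(strings.TrimSpace(v)) {
	case "1", "true", "yes":
		return true
	}
	return false
}

// buildAPS rejects empty pushes up-front and maps the snake_case
// Data["content_available"] key to Apple's "content-available": 1.
func buildAPS(n Notification) (map[string]interface{}, error) {
	contentAvailable := isTruthy(n.Data["content_available"])
	if n.Title == "" && n.Body == "" && n.Sound == "" && n.Badge == nil && !contentAvailable {
		return nil, ErrEmptyContent
	}
	aps := map[string]interface{}{}
	if n.Title != "" || n.Body != "" {
		aps["alert"] = map[string]string{"title": n.Title, "body": n.Body}
	}
	if n.Sound != "" {
		aps["sound"] = n.Sound
	}
	if n.Badge != nil {
		aps["badge"] = *n.Badge
	}
	if contentAvailable {
		aps["content-available"] = 1
	}
	return aps, nil
}

// classifyResponse turns a non-200 APNs response into a PushError.
// HTTP 410 marks the token as unregistered and wraps the legacy sentinel.
func classifyResponse(status int, reason string) error {
	if status == 200 {
		return nil
	}
	pe := &PushError{HTTPStatus: status, Reason: reason}
	if status == 410 {
		pe.Unregistered = true
		pe.Wrapped = ErrDeviceUnregistered
	}
	return pe
}

// shouldRemoveToken is how a dispatcher might branch on Unregistered.
func shouldRemoveToken(err error) bool {
	var pe *PushError
	return errors.As(err, &pe) && pe.Unregistered
}

// isLegacyUnregistered shows the preserved errors.Is path.
func isLegacyUnregistered(err error) bool {
	return errors.Is(err, ErrDeviceUnregistered)
}
```

Under these assumptions, a dispatcher can call shouldRemoveToken on the provider's error to prune dead tokens, while older call sites that use errors.Is against ErrDeviceUnregistered keep working.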
#321 - mid-session JWT refresh on persistent WS
Long-lived persistent WS connections used to have to close+reconnect
when the JWT rolled — losing per-instance state, message queues, and
subscriptions. The handler now accepts an "auth.refresh" control
frame: client sends the new token, the gateway re-verifies it via
the new JWTVerifier interface, updates the per-instance invCtx
in-place (persistent.Instance.UpdateInvCtx), and acks. No close, no
state loss.
JWTVerifier is optional — handlers set it via SetJWTVerifier at
gateway init. When unwired, the handler nacks with a "not supported
on this gateway" response and clients fall back to the old
close+reconnect path, so older deploys don't break.
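
The refresh flow above could look roughly like the sketch below. JWTVerifier, SetJWTVerifier, and UpdateInvCtx are names from this commit; the Verify signature, the InvCtx fields, the Reply shape, and VerifierFunc are assumptions for illustration only.

```go
package main

import "errors"

// JWTVerifier is named in the commit; this method signature is an assumption.
type JWTVerifier interface {
	Verify(token string) (wallet string, err error)
}

// VerifierFunc adapts a plain function to JWTVerifier (hypothetical helper).
type VerifierFunc func(token string) (string, error)

func (f VerifierFunc) Verify(token string) (string, error) { return f(token) }

// errBadToken is a placeholder verification failure.
var errBadToken = errors.New("invalid token")

// InvCtx is a stand-in for the per-instance invocation context.
type InvCtx struct {
	Wallet string
	Token  string
}

// Instance is a stand-in for persistent.Instance.
type Instance struct{ invCtx InvCtx }

// UpdateInvCtx swaps the context in place; queues and subscriptions are untouched.
func (i *Instance) UpdateInvCtx(c InvCtx) { i.invCtx = c }

// Reply is a hypothetical ack/nack payload for the control frame.
type Reply struct {
	OK    bool
	Error string
}

// Handler holds an optional verifier; nil means the gateway has not wired one.
type Handler struct{ verifier JWTVerifier }

func (h *Handler) SetJWTVerifier(v JWTVerifier) { h.verifier = v }

// HandleAuthRefresh processes one "auth.refresh" frame for a single instance.
func (h *Handler) HandleAuthRefresh(inst *Instance, token string) Reply {
	if h.verifier == nil {
		// Unwired: nack so the client falls back to close+reconnect.
		return Reply{OK: false, Error: "not supported on this gateway"}
	}
	wallet, err := h.verifier.Verify(token)
	if err != nil {
		// A rejected token leaves the existing context untouched.
		return Reply{OK: false, Error: err.Error()}
	}
	inst.UpdateInvCtx(InvCtx{Wallet: wallet, Token: token})
	return Reply{OK: true}
}
```

Keeping the verifier behind an interface that may be nil is what lets older deploys nack cleanly instead of failing the connection.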
Other:
- push/dispatcher.go: SendToUserDetailed returns per-device PushError
shape so callers can act on Unregistered / HTTPStatus / Reason.
- serverless/hostfunctions/push.go: WASM host functions for the new
detailed-error shape.
- serverless/persistent/instance.go: UpdateInvCtx mid-session.
Tests:
- ws_persistent_control_test.go: auth.refresh ack/nack paths.
- apns_test.go: empty-content rejection, PushError shape on 410 +
generic non-200, content-available mapping.
- dispatcher_detailed_test.go: SendToUserDetailed result shape.
- instance_update_invctx_test.go: invCtx update is per-instance, not
cross-tenant.
VERSION bumped to 0.122.27.
- Integrate PubSubDispatcher to enable libp2p subscription for trigger patterns
- Add BatchQuery to rqlite client to reduce round-trips for multi-query operations
- Implement lifecycle management for dispatcher and add safety limits for batch queries
Two serious bugs found via cross-node behavior observation:
1. libp2p peer-discovery published wrong port
PeerDiscovery's multiaddr was using the gateway's HTTP API port (e.g.
10004), not the actual libp2p TCP port. Remote gateways dialed that
port, hit the HTTP server, received 400, and failed the libp2p
multistream handshake ("message did not have trailing newline").
Result: cluster-wide cross-node libp2p mesh had 0 connected peers
and cross-node pubsub silently dropped 100% of messages.
The libp2p port is OS-assigned at startup (client.go uses
/ip4/0.0.0.0/tcp/0). It's not anywhere in cfg — it's only on
host.Addrs(). Fix: drop the listenPort field from PeerDiscovery
entirely and derive the port live from host.Addrs() via
extractLibp2pTCPPort. WG IP still comes from getWireGuardIP
(libp2p filters its own enumeration so WG IPs don't appear in
host.Addrs(), but the listener is bound 0.0.0.0 so the port is
reachable on the WG interface).
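
As an illustration of the fix above, here is a dependency-free sketch that pulls the TCP port out of the textual form of the host's listen addresses and pairs it with the WireGuard IP. The function name extractLibp2pTCPPort comes from this commit, but its real signature is not shown here; taking []string instead of libp2p multiaddr values, and the advertisedAddr helper, are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// extractLibp2pTCPPort returns the first usable TCP port found in the
// textual multiaddrs (e.g. "/ip4/127.0.0.1/tcp/43817").
// Assumption: the real helper works on host.Addrs() values directly.
func extractLibp2pTCPPort(addrs []string) (int, bool) {
	for _, a := range addrs {
		parts := strings.Split(strings.Trim(a, "/"), "/")
		for i := 0; i+1 < len(parts); i++ {
			if parts[i] != "tcp" {
				continue
			}
			port, err := strconv.Atoi(parts[i+1])
			if err == nil && port > 0 && port <= 65535 {
				return port, true
			}
		}
	}
	return 0, false
}

// advertisedAddr combines the WireGuard IP with the live libp2p port.
// It reports false when no TCP listener is present yet.
func advertisedAddr(wgIP string, addrs []string) (string, bool) {
	port, ok := extractLibp2pTCPPort(addrs)
	if !ok {
		return "", false
	}
	return fmt.Sprintf("/ip4/%s/tcp/%d", wgIP, port), true
}
```

Deriving the port at publish time, rather than caching it in a config-derived field, means the advertised address cannot drift from what the OS actually assigned.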
2. System triggers silently blocked by CanInvoke (#264)
Cron, pubsub, database, timer, and job triggers all fire from
gateway-internal state with no caller identity. Invoke() ran every
request through CanInvoke(callerWallet) which returned false for
the empty wallet — every fire returned ErrUnauthorized. Reported as
a cron firing every minute with "unauthorized" for 19+ hours.
Auth boundary for system triggers belongs at REGISTRATION time
(POST /v1/functions/{name}/triggers, deploy-time auto-register
from function.yaml). Skip the per-invocation check for system
trigger types; user-driven triggers (HTTP, WebSocket) still gate
on caller identity as before.
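
A sketch of the per-invocation gate after this change, under the assumption that trigger types are represented as simple string constants. CanInvoke and ErrUnauthorized are names from this commit; TriggerType, the constant names, and authorizeInvoke are hypothetical.

```go
package main

import "errors"

// ErrUnauthorized mirrors the error named in the commit.
var ErrUnauthorized = errors.New("unauthorized")

// TriggerType and its values are hypothetical labels for the trigger kinds.
type TriggerType string

const (
	TriggerHTTP      TriggerType = "http"
	TriggerWebSocket TriggerType = "websocket"
	TriggerCron      TriggerType = "cron"
	TriggerPubSub    TriggerType = "pubsub"
	TriggerDatabase  TriggerType = "database"
	TriggerTimer     TriggerType = "timer"
	TriggerJob       TriggerType = "job"
)

// isSystemTrigger reports whether the trigger fires from gateway-internal
// state and therefore has no caller identity.
func isSystemTrigger(t TriggerType) bool {
	switch t {
	case TriggerCron, TriggerPubSub, TriggerDatabase, TriggerTimer, TriggerJob:
		return true
	}
	return false
}

// authorizeInvoke skips the caller check for system triggers, which were
// already authorized at registration time, and gates everything else.
// canInvoke stands in for the function's CanInvoke(callerWallet) check.
func authorizeInvoke(t TriggerType, callerWallet string, canInvoke func(string) bool) error {
	if isSystemTrigger(t) {
		return nil
	}
	if !canInvoke(callerWallet) {
		return ErrUnauthorized
	}
	return nil
}
```

Unknown trigger types fall through to the caller check, so a new type is gated by default until it is explicitly listed as a system trigger.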
Tests:
- gateway/peer_discovery_test.go covers extractLibp2pTCPPort.
- serverless/invoke_system_trigger_test.go covers the bypass and the
user-trigger gate.
VERSION bumped to 0.122.25.
Migration 028: namespace_push_credentials
- Per-(namespace, provider) AES-256-GCM encrypted credential blob.
- Generic schema — apns/ntfy/expo/future plug in with zero migration.
- Separated from migration 026's namespace_push_config (preferences vs
credentials, different access patterns).
pkg/push/credentials
- Manager + Registry + RQLite store; HKDF purpose "namespace-push-credentials"
via pkg/secrets. Provider Validator interface for per-provider schema.
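
For readers unfamiliar with the scheme, the sketch below shows one way a purpose-scoped AES-256-GCM key can be derived and used to seal a credential blob with only the Go standard library. It is not the pkg/secrets implementation: the master secret input, the empty HKDF salt, the nonce-prefixed blob layout, and the use of "namespace/provider" as additional authenticated data are all assumptions.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"errors"
)

// deriveKey computes a 32-byte HKDF-SHA256 output (RFC 5869) with an empty
// salt and the purpose string as "info". One expand block is enough for 32 bytes.
func deriveKey(masterSecret []byte, purpose string) []byte {
	extract := hmac.New(sha256.New, nil) // empty salt
	extract.Write(masterSecret)
	prk := extract.Sum(nil)

	expand := hmac.New(sha256.New, prk)
	expand.Write([]byte(purpose))
	expand.Write([]byte{0x01}) // block counter for T(1)
	return expand.Sum(nil)
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key) // 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealCredentials returns nonce || ciphertext || tag.
// aad binds the blob to its row, e.g. "namespace/provider" (assumption).
func sealCredentials(key, plaintext, aad []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, aad), nil
}

// openCredentials reverses sealCredentials and fails on any tampering
// or on a mismatched aad.
func openCredentials(key, blob, aad []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(blob) < gcm.NonceSize() {
		return nil, errors.New("credential blob too short")
	}
	nonce, ciphertext := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, aad)
}
```

Scoping the key to a purpose string such as "namespace-push-credentials" means a key derived for one subsystem cannot decrypt another subsystem's data, even though both start from the same secret.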
pkg/push/providers/apns
- Apple Push Notification service direct provider (no Expo proxy).
- Validator + dispatcher; credentials are p8 signing key + key_id + team_id.
pkg/push/providers/ntfy/credentials.go
- ntfy credential schema (auth_token + default topic). Used both with
the public ntfy.sh and our self-hosted instance.
pkg/environments/production/installers/ntfy.go
- Self-hosted ntfy server installer. Binary, system user, hardened
/etc/ntfy/server.yml, systemd unit. Listens on 127.0.0.1:NtfyListenPort
only — Caddy is the only public path.
pkg/environments/production/installers/caddy.go
- Emit reverse_proxy block for push.<dnsZone> -> 127.0.0.1:NtfyListenPort
when operator enables ntfy on a node.
CLI: install/upgrade orchestrators learn a new "ntfy" install/preserve
phase; flag gating in install/flags.go + upgrade/flags.go.
Gateway handlers/push/credentials_handler.go
- GET/PUT/DELETE /v1/namespace/push-credentials/{provider}.
- PUT validates against provider Validator before encrypting and storing.
- GET returns a redacted view (booleans + non-secret fields only).
Push manager: provider resolution now also consults
namespace_push_credentials before falling back to YAML defaults.
Docs: core/docs/PUSH_NOTIFICATIONS.md walks through end-to-end setup.
VERSION bumped to 0.122.14.
Per-namespace rate-limit config (feature #69)
- Migration 027: new `namespace_rate_limit_config` table
(namespace PK, requests_per_minute, burst, audit metadata).
- pkg/ratelimit: Manager + RQLite ConfigStore + types. Same pattern
as the push config in bug #220's follow-up — LRU cache, invalidate
on PUT/DELETE, falls back to YAML defaults when no row exists.
- pkg/gateway/handlers/ratelimit: GET/PUT/DELETE /v1/namespace/rate-limit.
PUT requests are rejected if they exceed the operator's configured
ceiling (MaxRequestsPerMinute / MaxBurst) — tenants self-serve but
cannot raise their quota past the cap.
- pkg/gateway/rate_limiter.go: per-namespace lookup, default fallback.
- pkg/gateway/middleware.go: WS JWT middleware (middleware_ws_jwt_test.go).
- pkg/gateway/auth/service.go: refresh-token rotation hardening with
regression test in refresh_rotation_test.go.
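
The ceiling check on PUT /v1/namespace/rate-limit described above might be expressed as in this sketch. MaxRequestsPerMinute and MaxBurst are the operator settings named in this commit; the Config and Ceiling types, validateRateLimit, and the rule that values must be positive are assumptions.

```go
package main

import "fmt"

// Config is a hypothetical view of a tenant's requested limits.
type Config struct {
	RequestsPerMinute int
	Burst             int
}

// Ceiling holds the operator-configured caps named in the commit.
type Ceiling struct {
	MaxRequestsPerMinute int
	MaxBurst             int
}

// validateRateLimit rejects a request that is non-positive (assumed rule)
// or that would raise the tenant's quota past the operator's cap.
func validateRateLimit(req Config, cap Ceiling) error {
	if req.RequestsPerMinute <= 0 || req.Burst <= 0 {
		return fmt.Errorf("requests_per_minute and burst must be positive")
	}
	if req.RequestsPerMinute > cap.MaxRequestsPerMinute {
		return fmt.Errorf("requests_per_minute %d exceeds ceiling %d",
			req.RequestsPerMinute, cap.MaxRequestsPerMinute)
	}
	if req.Burst > cap.MaxBurst {
		return fmt.Errorf("burst %d exceeds ceiling %d", req.Burst, cap.MaxBurst)
	}
	return nil
}
```

Running this check in the PUT handler, before the row is written, keeps invalid values out of both the table and the LRU cache.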
AI agent instructions
- Add AGENTS.md, CLAUDE.md, .github/copilot-instructions.md (DeBros v0.2.0
baseline).
DeBros rules bumped to v0.2.0 (sha bb6e6ef).
VERSION bumped to 0.122.12.
- Add `namespace_push_config` table for per-namespace provider settings
- Introduce `cluster_secret_path` to enable deterministic JWT signing and
AES-256-GCM encryption for push credentials
- Update gateway config to support per-namespace overrides of push
notification providers (ntfy/Expo)
- Bump version to 0.122.3
- Add migrations for per-namespace publish sequences and persistent WebSocket function settings
- Integrate PersistentWSManager and WSBridge into the gateway dependency graph
- Upgrade serverless engine to use a multi-tier rate limiter
- Update JWT claims to support custom application-defined fields
- Implement `nodes`, `rollout`, `ssh`, and `status` commands
- Add `migrate-conf` utility to register existing nodes with the gateway
- Update database schema to support operator wallet tracking for nodes