22 Commits

Author SHA1 Message Date
anonpenguin23
1faf04e2a3 feat(cli): add function enable/disable and fix upgrade re-exec
- Add `enable` and `disable` commands to manage function status
- Implement process re-exec in the upgrade orchestrator to ensure
  Phase 4 config generation uses the newly-installed binary version
  (fixes bugboard #15)
2026-05-25 10:25:04 +03:00
anonpenguin23
b2d35bbde1 feat(gateway): enable local wildcard triggers for pubsub
- wire PubSubDispatcher to host functions to support local wildcard
  triggers for WASM-published topics
- implement batch deduplication by topic to prevent redundant trigger
  invocations and bound fan-out
- propagate trigger depth through function invocations to maintain
  recursion limits during local dispatch
2026-05-25 09:34:01 +03:00
anonpenguin23
98dad46a81 fix(gateway): decouple webrtc route registration from legacy flag
- Update route registration logic to rely solely on SFUPort > 0, resolving a silent 404 issue where gateways with valid SFU configurations were incorrectly disabled.
- Retain WebRTCEnabled in config for backward compatibility with existing operator YAML and request schemas.
- Add unit tests to pin registration behavior and prevent future regressions.
2026-05-24 20:56:08 +03:00
anonpenguin23
62e4d1963b feat(gateway): add apns_voip provider support
- register "apns_voip" provider to handle PushKit/CallKit signals
- implement target provider filtering in dispatcher to prevent cross-talk
  between alert and VoIP push paths
- add comprehensive tests to ensure backward compatibility for fan-out
  and correct filtering behavior
2026-05-24 19:38:38 +03:00
anonpenguin23
3b8139802c feat: APNs silent-drop guard + persistent-WS mid-session JWT refresh
#348 - APNs silent-drop guard
Apple's APNs silently returns HTTP 200 for pushes with no visible
content (no title, no body, no badge, no sound, no
content-available=1) and then drops them — which looked to the WASM
caller like a successful delivery. Now rejected up-front with the new
push.ErrEmptyContent sentinel, and the APNs provider returns the
structured push.PushError shape (HTTPStatus, Reason, Unregistered,
Wrapped) so the dispatcher can branch on Unregistered to remove dead
tokens automatically. Legacy ErrDeviceUnregistered sentinel is
preserved for errors.Is compatibility (wrapped inside PushError).

Always logs APNs HTTP response (status, reason, apns_id, token prefix)
so future silent-drop classes show up in operator logs.

content-available is also now correctly mapped from snake_case
Data["content_available"] (any truthy variant) into Apple's
canonical "content-available": 1 inside the aps dictionary.

#321 - mid-session JWT refresh on persistent WS
Long-lived persistent WS connections used to have to close+reconnect
when the JWT rolled — losing per-instance state, message queues, and
subscriptions. The handler now accepts an "auth.refresh" control
frame: client sends the new token, the gateway re-verifies it via
the new JWTVerifier interface, updates the per-instance invCtx
in-place (persistent.Instance.UpdateInvCtx), and acks. No close, no
state loss.

JWTVerifier is optional — handlers set it via SetJWTVerifier at
gateway init. When unwired the handler nack's with a "not supported
on this gateway" response and clients fall back to the old
close+reconnect path, so older deploys don't break.

Other:
- push/dispatcher.go: SendToUserDetailed returns per-device PushError
  shape so callers can act on Unregistered / HTTPStatus / Reason.
- serverless/hostfunctions/push.go: WASM host functions for the new
  detailed-error shape.
- serverless/persistent/instance.go: UpdateInvCtx mid-session.

Tests:
- ws_persistent_control_test.go: auth.refresh ack/nack paths.
- apns_test.go: empty-content rejection, PushError shape on 410 +
  generic non-200, content-available mapping.
- dispatcher_detailed_test.go: SendToUserDetailed result shape.
- instance_update_invctx_test.go: invCtx update is per-instance, not
  cross-tenant.

VERSION bumped to 0.122.27.
2026-05-19 18:19:21 +03:00
anonpenguin23
ebc9d51167 feat(gateway): implement pubsub dispatcher and batch query support
- Integrate PubSubDispatcher to enable libp2p subscription for trigger patterns
- Add BatchQuery to rqlite client to reduce round-trips for multi-query operations
- Implement lifecycle management for dispatcher and add safety limits for batch queries
2026-05-17 16:27:05 +03:00
anonpenguin23
17b06d38e4 fix(gateway,serverless): libp2p mesh peer-port + system-trigger auth bypass
Two serious bugs found via cross-node behavior observation:

1. libp2p peer-discovery published wrong port
   PeerDiscovery's multiaddr was using the gateway's HTTP API port (e.g.
   10004), not the actual libp2p TCP port. Remote gateways dialed that
   port, hit the HTTP server, received 400, and failed the libp2p
   multistream handshake ("message did not have trailing newline").
   Result: cluster-wide cross-node libp2p mesh had 0 connected peers
   and cross-node pubsub silently dropped 100% of messages.

   The libp2p port is OS-assigned at startup (client.go uses
   /ip4/0.0.0.0/tcp/0). It's not anywhere in cfg — it's only on
   host.Addrs(). Fix: drop the listenPort field from PeerDiscovery
   entirely and derive the port live from host.Addrs() via
   extractLibp2pTCPPort. WG IP still comes from getWireGuardIP
   (libp2p filters its own enumeration so WG IPs don't appear in
   host.Addrs(), but the listener is bound 0.0.0.0 so the port is
   reachable on the WG interface).

2. System triggers silently blocked by CanInvoke (#264)
   Cron, pubsub, database, timer, and job triggers all fire from
   gateway-internal state with no caller identity. Invoke() ran every
   request through CanInvoke(callerWallet) which returned false for
   the empty wallet — every fire returned ErrUnauthorized. Reported as
   a cron firing every minute with "unauthorized" for 19+ hours.

   Auth boundary for system triggers belongs at REGISTRATION time
   (POST /v1/functions/{name}/triggers, deploy-time auto-register
   from function.yaml). Skip the per-invocation check for system
   trigger types; user-driven triggers (HTTP, WebSocket) still gate
   on caller identity as before.

Tests:
- gateway/peer_discovery_test.go covers extractLibp2pTCPPort.
- serverless/invoke_system_trigger_test.go covers the bypass and the
  user-trigger gate.

VERSION bumped to 0.122.25.
2026-05-16 15:43:18 +03:00
anonpenguin23
251630a5c7 fix(serverless): per-call invCtx propagation prevents cross-tenant identity leak in persistent WS
HostFunctions is a process-wide singleton (one per gateway engine).
Its `invCtx` field is shared across all WASM instances. For STATELESS
execution the executor sets/clears it per-call but the lock is
released before WASM runs — two concurrent invocations can race on
the field and one's host call can read the other's identity. Window
is microseconds.

For PERSISTENT WS the bug was much worse: invCtx used to be bound
ONCE at instantiation and reused for the connection's lifetime. Two
simultaneous persistent WS connections from different namespaces /
wallets overwrote each other's invCtx, and EVERY subsequent
function_invoke / GetCallerJWTSubject / GetCallerWallet / GetSecret
call from inside the WASM read whatever was bound LAST. Result:
silent identity leak across tenants for as long as the connections
overlapped.

Fix: per-call invCtx propagation through Go's context.Context.
wazero passes the ctx given to api.Function.Call through to host
function callbacks, so every WASM-host hop carries its own invCtx.

- pkg/serverless/invocation_context.go (new): WithInvocationContext +
  InvocationContextFromCtx helpers using an unexported invCtxKey.
- pkg/serverless/hostfunctions/invocation_context.go (new):
  currentInvocationContext(ctx) — ctx-attached invCtx wins over the
  singleton field.
- All host accessors (FunctionInvoke, GetEnv, GetSecret, GetRequestID,
  GetCallerWallet, GetWSClientID, GetCallerClaim, GetCallerJWTSubject)
  now route through currentInvocationContext(ctx).
- pkg/serverless/persistent/instance.go: every export call's ctx is
  wrapped with the per-instance invCtx before being passed to wazero.
- pkg/gateway/handlers/serverless/ws_persistent_handler.go: invCtx is
  built per-frame and attached to ctx, not stored on a shared field.
- pkg/serverless/engine.go: removed the SetInvocationContext call at
  InstantiatePersistent (no longer needed; ctx carries it).

Stateless still uses the singleton field — its race is latent since
the host-functions split and migrating it is a separate scoped
change.

Tests:
- hostfunctions/invocation_context_test.go covers ctx-wins-over-singleton.
- gateway/handlers/serverless/ws_persistent_handler_test.go covers the
  per-frame ctx wiring.
- cli/functions/build_test.go is new coverage for the build path
  touched in this change.

VERSION bumped to 0.122.24.
2026-05-15 13:36:35 +03:00
anonpenguin23
a0a1decd06 fix(ws): prefer X-Forwarded-Host in Origin check — root cause #240/#249
handleNamespaceGatewayRequest rewrites r.Host to the backend target
IP:port (e.g. "10.0.0.6:10004") before forwarding. The original
public host (e.g. "ns-anchat-test.orama-devnet.network") is preserved
in X-Forwarded-Host. checkWSOrigin in both pubsub/ws_client.go and
serverless/ws_handler.go was comparing the client's Origin against
the proxied r.Host only — so every browser / RN-iOS WS upgrade was
rejected 403 because their Origin's public hostname can never match
10.0.0.6.

curl probes don't send Origin, so curl returned true unconditionally
and the bug was invisible to operator smoke tests. AnChat's iPhone
WS clients hit `code=1006 reason="Received bad response code from
server: 403"` for ~24h.

Fix: prefer X-Forwarded-Host (the original public host) when present,
fall back to r.Host for direct (non-proxied) connections. Applied
identically to both WS handlers. Regression test in
serverless/ws_origin_test.go covers the proxy-hop case, no-Origin
case, and direct-connection case.

This is the real fix; v0.122.19 only closed a separate silent-forward
auth hole that produced opaque 401s on a different code path.

VERSION bumped to 0.122.20.
2026-05-15 07:03:28 +03:00
anonpenguin23
872c553d1c fix(gateway): namespace-proxy rejects unauthed requests at main, logs WS audit
Root-cause hardening for bug #240 and #249's "intermittent 401 over WS"
reports. handleNamespaceGatewayRequest previously had a third code
path beyond "auth ok" and "auth error": when validateAuthForNamespaceProxy
returned empty namespace AND empty error (i.e. "no credentials found"),
the request fell through to a silent forward to the namespace gateway
WITHOUT internal-auth headers. The namespace gateway then rejected
with 401 "missing API key" in ~60µs.

From the client's perspective: opaque 401.
From our side: only the namespace gateway logged it, and that tier
can't validate API keys (they live in the main cluster RQLite), so
the operator had no signal that the main gateway had even seen the
request. AnChat's intermittent 401-on-WS reports went unsolved for
this exact reason.

Fix:
- Explicit reject at main when no credentials extracted AND path
  isn't public. Returns 401 with WWW-Authenticate: Bearer realm and a
  clear message naming the three accepted credential sources.
- Rich structured logging on every WS upgrade auth outcome: presence
  of api_key/token/jwt query params, Authorization + X-API-Key
  headers, Connection/Upgrade headers, Origin, User-Agent, client IP,
  raw query length. Steady-state stays low-noise: success path logs at
  debug, reject paths log at warn.
- Namespace-mismatch reject (existing branch) now also logs.

VERSION bumped to 0.122.19.
2026-05-14 17:53:38 +03:00
anonpenguin23
07638354d2 feat(#72): full-privacy push — self-hosted ntfy + APNs-direct provider
Migration 028: namespace_push_credentials
- Per-(namespace, provider) AES-256-GCM encrypted credential blob.
- Generic schema — apns/ntfy/expo/future plug in with zero migration.
- Separated from migration 026's namespace_push_config (preferences vs
  credentials, different access patterns).

pkg/push/credentials
- Manager + Registry + RQLite store; HKDF purpose "namespace-push-credentials"
  via pkg/secrets. Provider Validator interface for per-provider schema.

pkg/push/providers/apns
- Apple Push Notification service direct provider (no Expo proxy).
- Validator + dispatcher; credentials are p8 signing key + key_id + team_id.

pkg/push/providers/ntfy/credentials.go
- ntfy credential schema (auth_token + default topic). Used both with
  the public ntfy.sh and our self-hosted instance.

pkg/environments/production/installers/ntfy.go
- Self-hosted ntfy server installer. Binary, system user, hardened
  /etc/ntfy/server.yml, systemd unit. Listens on 127.0.0.1:NtfyListenPort
  only — Caddy is the only public path.

pkg/environments/production/installers/caddy.go
- Emit reverse_proxy block for push.<dnsZone> -> 127.0.0.1:NtfyListenPort
  when operator enables ntfy on a node.

CLI: install/upgrade orchestrators learn a new "ntfy" install/preserve
phase; flag gating in install/flags.go + upgrade/flags.go.

Gateway handlers/push/credentials_handler.go
- GET/PUT/DELETE /v1/namespace/push-credentials/{provider}.
- PUT validates against provider Validator before encrypting and storing.
- GET returns a redacted view (booleans + non-secret fields only).

Push manager: provider resolution now also consults
namespace_push_credentials before falling back to YAML defaults.

Docs: core/docs/PUSH_NOTIFICATIONS.md walks through end-to-end setup.

VERSION bumped to 0.122.14.
2026-05-14 10:48:00 +03:00
anonpenguin23
fda47533c3 feat: per-namespace rate-limit self-service + WS JWT auth + release 0.122.12
Per-namespace rate-limit config (feature #69)
- Migration 027: new `namespace_rate_limit_config` table
  (namespace PK, requests_per_minute, burst, audit metadata).
- pkg/ratelimit: Manager + RQLite ConfigStore + types. Same pattern
  as the push config in bug #220's follow-up — LRU cache, invalidate
  on PUT/DELETE, falls back to YAML defaults when no row exists.
- pkg/gateway/handlers/ratelimit: GET/PUT/DELETE /v1/namespace/rate-limit.
  PUT requests are rejected if they exceed the operator's configured
  ceiling (MaxRequestsPerMinute / MaxBurst) — tenants self-serve but
  cannot raise their quota past the cap.
- pkg/gateway/rate_limiter.go: per-namespace lookup, default fallback.
- pkg/gateway/middleware.go: WS JWT middleware (middleware_ws_jwt_test.go).
- pkg/gateway/auth/service.go: refresh-token rotation hardening with
  regression test in refresh_rotation_test.go.

AI agent instructions
- Add AGENTS.md, CLAUDE.md, .github/copilot-instructions.md (DeBros v0.2.0
  baseline).

DeBros rules bumped to v0.2.0 (sha bb6e6ef).

VERSION bumped to 0.122.12.
2026-05-13 15:41:36 +03:00
anonpenguin23
91774de465 fix(gateway): update rqlite consistency level and improve column mapping
- Change RQLite consistency level from `none` to `weak` to ensure reads
  route to the leader and prevent stale data reads (fixes #235)
- Add `normalizeColumnKey` to allow snake_case SQL columns to map to
  CamelCase Go struct fields automatically (fixes #65)
- Add comprehensive unit tests for DSN generation and column mapping
2026-05-12 09:13:03 +03:00
anonpenguin23
f55c7269cd feat(gateway): implement self-service tenant push notifications
- Add `namespace_push_config` table for per-namespace provider settings
- Introduce `cluster_secret_path` to enable deterministic JWT signing and
  AES-256-GCM encryption for push credentials
- Update gateway config to support per-namespace overrides of push
  notification providers (ntfy/Expo)
- Bump version to 0.122.3
2026-05-08 11:23:53 +03:00
anonpenguin23
b5f6fb4497 docs: update deployment and serverless documentation
- bump version to 0.122.2
- document schema migration invariants and push notification configuration
- add serverless host function aliases and v2 database API documentation
- introduce schema roundtrip test to prevent migration drift
2026-05-07 07:33:52 +03:00
anonpenguin23
4cce4bd97b feat(migrations): implement schema version contract enforcement
- Add `contract.go` to manage and validate embedded SQL migrations
- Introduce `AssertSchema` to verify database version at startup
- Include `SchemaMismatchError` with actionable recovery instructions
- Add comprehensive unit tests for version parsing and validation
2026-05-06 08:23:13 +03:00
0f42816a78 etc 2026-05-05 11:35:35 +03:00
anonpenguin23
604ce221d5 feat(gateway): implement persistent webhooks and namespace sequencing
- Add migrations for per-namespace publish sequences and persistent WebSocket function settings
- Integrate PersistentWSManager and WSBridge into the gateway dependency graph
- Upgrade serverless engine to use a multi-tier rate limiter
- Update JWT claims to support custom application-defined fields
2026-05-04 11:38:19 +03:00
anonpenguin23
9225215ed3 feat(core): implement sni-router for stealth turn
- add `orama-sni-router` binary to build process
- introduce `cmd/sni-router` for TLS-level SNI routing
- add documentation for stealth turn deployment architecture
2026-05-03 18:20:21 +03:00
anonpenguin23
8c4e18908b feat(auth): integrate rootwallet agent and update service hardening
- Replace CLI-based rootwallet calls with agent-based communication
- Update production provisioner to support sudo-based service management
- Add API key-to-wallet resolution for gateway operator handlers
2026-03-28 08:59:11 +02:00
anonpenguin23
fe4823dbba feat(cli): add node management and rollout commands
- implement `nodes`, `rollout`, `ssh`, and `status` commands
- add `migrate-conf` utility to register existing nodes with the gateway
- update database schema to support operator wallet tracking for nodes
2026-03-27 16:25:32 +02:00
anonpenguin23
86fe0588b9 refactor: move Go project into core/ for monorepo structure 2026-03-26 18:14:52 +02:00