# Security Hardening

This document describes all security measures applied to the Orama Network, covering both Phase 1 (service hardening on existing Ubuntu nodes) and Phase 2 (the locked-down OramaOS image).

## Phase 1: Service Hardening

These measures apply to all nodes (Ubuntu and OramaOS).

### Network Isolation

**CIDR Validation (Step 1.1)**

- WireGuard subnet restricted to `10.0.0.0/24` across all components: firewall rules, rate limiter, auth module, and WireGuard PostUp/PostDown iptables rules
- Prevents other tenants on shared VPS providers from bypassing the firewall via overlapping `10.x.x.x` ranges

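The subnet check can be sketched with the standard library; `isWireGuardPeer` is an illustrative name, not the actual function in the codebase:

```go
package main

import (
	"fmt"
	"net"
)

// wireguardSubnet is the only range internal peers may originate from.
var wireguardSubnet = mustCIDR("10.0.0.0/24")

func mustCIDR(s string) *net.IPNet {
	_, ipnet, err := net.ParseCIDR(s)
	if err != nil {
		panic(err)
	}
	return ipnet
}

// isWireGuardPeer reports whether addr falls inside 10.0.0.0/24.
// A looser /8-style prefix check would wrongly admit other tenants'
// 10.x.x.x traffic on shared VPS networks.
func isWireGuardPeer(addr string) bool {
	ip := net.ParseIP(addr)
	return ip != nil && wireguardSubnet.Contains(ip)
}

func main() {
	fmt.Println(isWireGuardPeer("10.0.0.7")) // true
	fmt.Println(isWireGuardPeer("10.1.2.3")) // false: outside the /24
}
```

The same `/24` must appear in every enforcement point (firewall, rate limiter, auth, PostUp/PostDown rules); a single stray `/8` reopens the bypass.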
**IPv6 Disabled (Step 1.2)**

- IPv6 disabled system-wide via sysctl: `net.ipv6.conf.all.disable_ipv6=1`
- Prevents services bound to `0.0.0.0` from being reachable via IPv6 (which had no firewall rules)

### Authentication

**Internal Endpoint Auth (Step 1.3)**

- `/v1/internal/wg/peers` and `/v1/internal/wg/peer/remove` now require cluster secret validation
- Peer removal additionally validates the request originates from a WireGuard subnet IP

**RQLite Authentication (Step 1.7)**

- RQLite runs with `-auth` flag pointing to a credentials file
- All RQLite HTTP requests include `Authorization: Basic <base64>` headers
- Credentials generated at cluster genesis, distributed to joining nodes via join response
- Both the central RQLite client wrapper and the standalone CoreDNS RQLite client send auth

**Olric Gossip Encryption (Step 1.8)**

- Olric memberlist uses a 32-byte encryption key for all gossip traffic
- Key generated at genesis, distributed via join response
- Prevents rogue nodes from joining the gossip ring and poisoning caches
- Note: encryption is all-or-nothing (coordinated restart required when enabling)

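Generating the key at genesis is a one-liner over `crypto/rand`; a sketch (the encoding used in the join response is an assumption):

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// newGossipKey returns a fresh 32-byte memberlist encryption key.
// memberlist accepts 16-, 24-, or 32-byte keys (AES-128/192/256);
// the cluster standardizes on 32 bytes.
func newGossipKey() ([]byte, error) {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		return nil, err
	}
	return key, nil
}

func main() {
	key, err := newGossipKey()
	if err != nil {
		panic(err)
	}
	// An encoded form like this is what a join response could carry.
	fmt.Println(base64.StdEncoding.EncodeToString(key))
}
```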
**IPFS Cluster TrustedPeers (Step 1.9)**

- IPFS Cluster `TrustedPeers` populated with actual cluster peer IDs (was `["*"]`)
- New peers added to TrustedPeers on all existing nodes during join
- Prevents unauthorized peers from controlling IPFS pinning

**Vault V1 Auth Enforcement (Step 1.14)**

- V1 push/pull endpoints require a valid session token when vault-guardian is configured
- Previously, auth was optional for backward compatibility — any WireGuard peer could read or overwrite Shamir shares

### Token & Key Storage

**Refresh Token Hashing (Step 1.5)**

- Refresh tokens stored as SHA-256 hashes in RQLite (never plaintext)
- On lookup: hash the incoming token, query by hash
- On revocation: hash before revoking (both single-token and by-subject)
- Existing tokens invalidated on upgrade (users re-authenticate)

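The hash-then-query pattern can be sketched in a few lines (`hashToken` is an illustrative name):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashToken returns the hex SHA-256 digest stored in RQLite in place
// of the plaintext refresh token. Lookup and revocation both hash the
// incoming token first and query by digest, so the plaintext never
// touches the database.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(hashToken("example-refresh-token"))
}
```

Because the stored value changes format, tokens issued before the upgrade can no longer be matched, which is why users must re-authenticate.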
**API Key Hashing (Step 1.6)**

- API keys stored as HMAC-SHA256 hashes using a server-side secret
- HMAC secret generated at cluster genesis, stored in `~/.orama/secrets/api-key-hmac-secret`
- On lookup: compute HMAC, query by hash — fast enough for every request (unlike bcrypt)
- In-memory cache uses raw key as cache key (never persisted)
- During rolling upgrade: dual lookup (HMAC first, then raw as fallback) until all nodes upgraded

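A sketch of the keyed-hash lookup value; the secret literal is a stand-in for the file contents, and `hashAPIKey` is an illustrative name:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashAPIKey computes HMAC-SHA256(secret, apiKey). Unlike a plain
// hash, a leaked database alone is not enough to test key guesses:
// the attacker also needs the server-side HMAC secret. Unlike bcrypt,
// a single HMAC is cheap enough to compute on every request.
func hashAPIKey(secret, apiKey []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(apiKey)
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	// Stand-in for the key material in ~/.orama/secrets/api-key-hmac-secret.
	secret := []byte("server-side-hmac-secret")
	fmt.Println(hashAPIKey(secret, []byte("example-api-key")))
}
```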
**TURN Secret Encryption (Step 1.15)**

- TURN shared secrets encrypted at rest in RQLite using AES-256-GCM
- Encryption key derived via HKDF from the cluster secret with purpose string `"turn-encryption"`

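The derive-then-seal flow can be sketched with the standard library. `deriveKey` below is a minimal single-block HKDF-SHA256 (RFC 5869), which suffices for one 32-byte output; the empty salt and ciphertext layout (nonce prepended) are assumptions about this sketch, not the production format:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// deriveKey: HKDF-SHA256, first output block only (32 bytes).
// Extract with a zero salt, then expand with the purpose string.
func deriveKey(clusterSecret []byte, purpose string) []byte {
	ext := hmac.New(sha256.New, make([]byte, sha256.Size)) // PRK = HMAC(zero-salt, IKM)
	ext.Write(clusterSecret)
	prk := ext.Sum(nil)
	exp := hmac.New(sha256.New, prk) // T(1) = HMAC(PRK, info || 0x01)
	exp.Write([]byte(purpose))
	exp.Write([]byte{1})
	return exp.Sum(nil) // 32 bytes -> AES-256 key
}

// seal encrypts a TURN shared secret for storage, prepending the
// random nonce to the GCM ciphertext.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal; GCM authentication rejects tampered ciphertexts.
func open(key, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, fmt.Errorf("ciphertext too short")
	}
	return gcm.Open(nil, sealed[:n], sealed[n:], nil)
}

func main() {
	key := deriveKey([]byte("cluster-secret"), "turn-encryption")
	sealed, err := seal(key, []byte("turn-shared-secret"))
	if err != nil {
		panic(err)
	}
	plain, err := open(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", plain)
}
```

Binding the key to the purpose string means a key derived for another subsystem from the same cluster secret cannot decrypt TURN secrets.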
### TLS & Transport

**InsecureSkipVerify Fix (Step 1.10)**

- During node join, TLS verification uses TOFU (Trust On First Use)
- Invite token output includes the CA certificate fingerprint (SHA-256)
- Joining node verifies the server cert fingerprint matches before proceeding
- After join: CA cert stored locally for future connections

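The fingerprint pin is the only trust anchor the joining node has on first contact. A sketch of the comparison; the hex formatting and function names are assumptions, not the invite token's actual format:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint returns the SHA-256 digest of a certificate's DER bytes.
func fingerprint(der []byte) string {
	sum := sha256.Sum256(der)
	return hex.EncodeToString(sum[:])
}

// verifyPinned compares the presented certificate against the pinned
// fingerprint carried in the invite token; only on a match does the
// join proceed and the CA cert get stored for future connections.
func verifyPinned(der []byte, pinned string) bool {
	return fingerprint(der) == pinned
}

func main() {
	der := []byte{0x30, 0x82, 0x01, 0x0a} // stand-in for real DER bytes
	pin := fingerprint(der)
	fmt.Println(verifyPinned(der, pin)) // true
}
```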
**WebSocket Origin Validation (Step 1.4)**

- All WebSocket upgraders validate the `Origin` header against the node's configured domain
- Non-browser clients (no Origin header) are still allowed
- Prevents cross-site WebSocket hijacking attacks

### Process Isolation

**Dedicated User (Step 1.11)**

- All services run as the `orama` user (not root)
- Caddy and CoreDNS get `AmbientCapabilities=CAP_NET_BIND_SERVICE` for ports 80/443 and 53
- WireGuard stays as root (kernel netlink requires it)
- vault-guardian already had proper hardening

**systemd Hardening (Step 1.12)**

- All service units include:

```ini
ProtectSystem=strict
ProtectHome=yes
NoNewPrivileges=yes
PrivateDevices=yes
ProtectKernelTunables=yes
ProtectKernelModules=yes
RestrictNamespaces=yes
ReadWritePaths=/opt/orama/.orama
```

- Applied to both template files (`pkg/environments/templates/`) and hardcoded unit generators (`pkg/environments/production/services.go`)

### Supply Chain

**Binary Signing (Step 1.13)**

- Build archives include `manifest.sig` — a rootwallet EVM signature of the manifest hash
- During install, the signature is verified against the embedded Orama public key
- Unsigned or tampered archives are rejected

## Phase 2: OramaOS

These measures apply only to OramaOS nodes (mainnet, devnet, testnet).

### Immutable OS

- **Read-only rootfs** — SquashFS with dm-verity integrity verification
- **No shell** — `/bin/sh` symlinked to `/bin/false`, no bash/ash/ssh
- **No SSH** — OpenSSH not included in the image
- **Minimal packages** — only what's needed for systemd, cryptsetup, and the agent

### Full-Disk Encryption

- **LUKS2** with AES-XTS-Plain64 on the data partition
- **Shamir's Secret Sharing** over GF(256) — LUKS key split across peer vault-guardians
- **Adaptive threshold** — K = max(3, N/3) where N is the number of peers
- **Key zeroing** — LUKS key wiped from memory immediately after use
- **Malicious share detection** — fetch K+1 shares when possible, verify consistency

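The adaptive threshold is simple arithmetic; a sketch, assuming integer (floor) division for N/3, which the formula above does not specify:

```go
package main

import "fmt"

// shareThreshold computes the adaptive Shamir threshold K = max(3, N/3)
// used when splitting the LUKS key across N peer vault-guardians.
// Integer division is an assumption about the exact rounding.
func shareThreshold(n int) int {
	k := n / 3
	if k < 3 {
		k = 3
	}
	return k
}

func main() {
	for _, n := range []int{5, 9, 30} {
		fmt.Printf("N=%d -> K=%d\n", n, shareThreshold(n)) // K=3, 3, 10
	}
}
```

The floor of 3 keeps small clusters from degenerating to a 1-of-N scheme, while N/3 scales the threshold up as the network grows.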
### Service Sandboxing

Each service runs in isolated Linux namespaces:

- **CLONE_NEWNS** — mount namespace (filesystem isolation)
- **CLONE_NEWUTS** — hostname namespace
- **Dedicated UID/GID** — each service has its own user
- **Seccomp filtering** — per-service syscall allowlist

Note: CLONE_NEWPID is intentionally omitted — it would make each service PID 1 in its own namespace, which changes signal semantics (SIGTERM is ignored by default for PID 1).

### Signed Updates

- A/B partition scheme with systemd-boot and boot counting (`tries_left=3`)
- All updates signed with rootwallet EVM signature (secp256k1 + keccak256)
- Signer address: `0xb5d8a496c8b2412990d7D467E17727fdF5954afC`
- P2P distribution over WireGuard between nodes
- Automatic rollback on 3 consecutive boot failures

### Zero Operator Access

- Operators cannot read data on the machine (LUKS encrypted, no shell)
- Management only through Gateway API → agent over WireGuard
- All commands are logged and auditable
- No root access, no console access, no file system access

## Rollout Strategy

### Phase 1 Batches

```
Batch 1 (zero-risk, no restart):
- CIDR fix
- IPv6 disable
- Internal endpoint auth
- WebSocket origin check

Batch 2 (medium-risk, restart needed):
- Hash refresh tokens
- Hash API keys
- Binary signing
- Vault V1 auth enforcement
- TURN secret encryption

Batch 3 (high-risk, coordinated rollout):
- RQLite auth (followers first, leader last)
- Olric encryption (simultaneous restart)
- IPFS Cluster TrustedPeers

Batch 4 (infrastructure changes):
- InsecureSkipVerify fix
- Dedicated user
- systemd hardening
```

### Phase 2

1. Build and test the OramaOS image in QEMU
2. Deploy to the sandbox cluster alongside Ubuntu nodes
3. Verify interop and stability
4. Gradual migration: testnet → devnet → mainnet (one node at a time, maintaining Raft quorum)

## Verification

All changes verified on the sandbox cluster before production deployment:

- `make test` — all unit tests pass
- `orama monitor report --env sandbox` — full cluster health
- Manual endpoint testing (e.g., curl without auth → 401)
- Security-specific checks (IPv6 listeners, RQLite auth, binary signatures)