diff --git a/.cursor/plans/dynamic-ec358e91.plan.md b/.cursor/plans/dynamic-ec358e91.plan.md new file mode 100644 index 0000000..1d428d9 --- /dev/null +++ b/.cursor/plans/dynamic-ec358e91.plan.md @@ -0,0 +1,165 @@ + +# Dynamic Database Clustering — Implementation Plan + +### Scope + +Implement the feature described in `DYNAMIC_DATABASE_CLUSTERING.md`: decentralized metadata via libp2p pubsub, dynamic per-database rqlite clusters (3-node default), idle hibernation/wake-up, node failure replacement, and client UX that exposes `cli.Database(name)` with app namespacing. + +### Guiding Principles + +- Reuse existing `pkg/pubsub` and `pkg/rqlite` where practical; avoid singletons. +- Backward-compatible config migration with deprecations, feature-flag controlled rollout. +- Strong eventual consistency (vector clocks + periodic gossip) over centralized control planes. +- Tests and observability at each phase. + +### Phase 0: Prep & Scaffolding + +- Add feature flag `dynamic_db_clustering` (env/config) → default off. +- Introduce config shape for new `database` fields while supporting legacy fields (soft deprecated). +- Create empty packages and interfaces to enable incremental compilation: + - `pkg/metadata/{types.go,manager.go,pubsub.go,consensus.go,vector_clock.go}` + - `pkg/dbcluster/{manager.go,lifecycle.go,subprocess.go,ports.go,health.go,metrics.go}` +- Ensure rqlite subprocess availability (binary path detection, `scripts/install-debros-network.sh` update if needed). +- Establish CI jobs for new unit/integration suites and longer-running e2e. + +### Phase 1: Metadata Layer (No hibernation yet) + +- Implement metadata types and store (RW locks, versioning) inside `pkg/rqlite/metadata.go`: + - `DatabaseMetadata`, `NodeCapacity`, `PortRange`, `MetadataStore`. +- Pubsub schema and handlers inside `pkg/rqlite/pubsub.go` using existing `pkg/pubsub` bridge: + - Topic `/debros/metadata/v1`; messages for create request/response/confirm, status, node capacity, health. +- Consensus helpers inside `pkg/rqlite/consensus.go` and `pkg/rqlite/vector_clock.go`: + - Deterministic coordinator (lowest peer ID), vector clocks, merge rules, periodic full-state gossip (checksums + fetch diffs). +- Reuse existing node connectivity/backoff; no new ping service required. +- Skip unit tests for now; validate by wiring e2e flows later. + +### Phase 2: Database Creation & Client API + +- Port management: + - `PortManager` with bind-probing, random allocation within configured ranges; local bookkeeping. +- Subprocess control: + - `RQLiteInstance` lifecycle (start, wait ready via /status and simple query, stop, status). +- Cluster manager: + - `ClusterManager` keeps `activeClusters`, listens to metadata events, executes creation protocol, readiness fan-in, failure surfaces. +- Client API: + - Update `pkg/client/interface.go` to include `Database(name string)`. + - Implement app namespacing in `pkg/client/client.go` (sanitize app name + db name). + - Backoff polling for readiness during creation. +- Data isolation: + - Data dir per db: `./data/_/rqlite` (respect node `data_dir` base). +- Integration tests: create single db across 3 nodes; multiple databases coexisting; cross-node read/write. + +### Phase 3: Hibernation & Wake-Up + +- Idle detection and coordination: + - Track `LastQuery` per instance; periodic scan; all-nodes-idle quorum → coordinated shutdown schedule. 
+- Hibernation protocol:
+  - Broadcast idle notices, coordinator schedules `DATABASE_SHUTDOWN_COORDINATED`, graceful SIGTERM, ports freed, status → `hibernating`.
+- Wake-up protocol:
+  - Client detects `hibernating`, performs CAS to `waking`, triggers wake request; port reuse if available else re-negotiate; start instances; status → `active`.
+- Client retry UX:
+  - Transparent retries with exponential backoff; treat `waking` as wait-only state.
+- Tests: hibernation under load; thundering herd; resource verification and persistence across cycles.
+
+### Phase 4: Resilience (Failure & Replacement)
+
+- Continuous health checks with timeouts → mark node unhealthy.
+- Replacement orchestration:
+  - Coordinator initiates `NODE_REPLACEMENT_NEEDED`, eligible nodes respond, confirm selection, new node joins raft via `-join` then syncs.
+- Startup reconciliation:
+  - Detect and cleanup orphaned or non-member local data directories.
+- Rate limiting replacements to prevent cascades; prioritize by usage metrics.
+- Tests: forced crashes, partitions, replacement within target SLO; reconciliation sanity.
+
+### Phase 5: Production Hardening & Optimization
+
+- Metrics/logging:
+  - Structured logs with trace IDs; counters for queries/min, hibernations, wake-ups, replacements; health and capacity gauges.
+- Config validation, replication factor settings (1,3,5), and debugging APIs (read-only metadata dump, node status).
+- Client metadata caching and query routing improvements (simple round-robin → latency-aware later).
+- Performance benchmarks and operator-facing docs.
+
+### File Changes (Essentials)
+
+- `pkg/config/config.go`
+  - Remove (deprecate, then delete): `Database.DataDir`, `RQLitePort`, `RQLiteRaftPort`, `RQLiteJoinAddress`.
+  - Add: `ReplicationFactor int`, `HibernationTimeout time.Duration`, `MaxDatabases int`, `PortRange {HTTPStart, HTTPEnd, RaftStart, RaftEnd int}`, `Discovery.HealthCheckInterval`.
+- `pkg/client/interface.go`/`pkg/client/client.go`
+  - Add `Database(name string)` and app namespace requirement (`DefaultClientConfig(appName)`); backoff polling.
+- `pkg/node/node.go`
+  - Wire `metadata.Manager` and `dbcluster.ClusterManager`; remove direct rqlite singleton usage.
+- `pkg/rqlite/*`
+  - Refactor to instance-oriented helpers from singleton.
+- New packages under `pkg/metadata` and `pkg/dbcluster` as listed above.
+- `configs/node.yaml` and validation paths to reflect new `database` block.
+
+### Config Example (target end-state)
+
+```yaml
+node:
+  data_dir: "./data"
+
+database:
+  replication_factor: 3
+  hibernation_timeout: 60s
+  max_databases: 100
+  port_range:
+    http_start: 5001
+    http_end: 5999
+    raft_start: 7001
+    raft_end: 7999
+
+discovery:
+  health_check_interval: 10s
+```
+
+### Rollout Strategy
+
+- Keep feature flag off by default; support legacy single-cluster path.
+- Ship Phase 1 behind flag; enable in dev/e2e only.
+- Incrementally enable creation (Phase 2), then hibernation (Phase 3) per environment.
+- Remove legacy config after deprecation window.
+
+### Testing & Quality Gates
+
+- Unit tests: metadata ops, consensus, ports, subprocess, manager state machine.
+- Integration tests under `e2e/` for creation, isolation, hibernation, failure handling, partitions.
+- Benchmarks for creation (<10s), wake-up (<8s), metadata sync (<5s), query overhead (<10ms).
+- Chaos suite for randomized failures and partitions.
+
+### Risks & Mitigations (operationalized)
+
+- Metadata divergence → vector clocks + periodic checksums + majority read checks in client.
+- Raft churn → adaptive timeouts; allow `always_on` flag per-db (future). +- Cascading replacements → global rate limiter and prioritization. +- Debuggability → verbose structured logging and metadata dump endpoints. + +### Timeline (indicative) + +- Weeks 1-2: Phases 0-1 +- Weeks 3-4: Phase 2 +- Weeks 5-6: Phase 3 +- Weeks 7-8: Phase 4 +- Weeks 9-10+: Phase 5 + +### To-dos + +- [ ] Add feature flag, scaffold packages, CI jobs, rqlite binary checks +- [ ] Extend `pkg/config/config.go` and YAML schemas; deprecate legacy fields +- [ ] Implement metadata types and thread-safe store with versioning +- [ ] Implement pubsub messages and handlers using existing pubsub manager +- [ ] Implement coordinator election, vector clocks, gossip reconciliation +- [ ] Implement `PortManager` with bind-probing and allocation +- [ ] Implement rqlite subprocess control and readiness checks +- [ ] Implement `ClusterManager` and creation lifecycle orchestration +- [ ] Add `Database(name)` and app namespacing to client; backoff polling +- [ ] Adopt per-database data dirs under node `data_dir` +- [ ] Integration tests for creation and isolation across nodes +- [ ] Idle detection, coordinated shutdown, status updates +- [ ] Wake-up CAS to `waking`, port reuse/renegotiation, restart +- [ ] Client transparent retry/backoff for hibernation and waking +- [ ] Health checks, replacement orchestration, rate limiting +- [ ] Implement orphaned data reconciliation on startup +- [ ] Add metrics and structured logging across managers +- [ ] Benchmarks for creation, wake-up, sync, query overhead +- [ ] Operator and developer docs; config and migration guides \ No newline at end of file diff --git a/DYNAMIC_CLUSTERING_GUIDE.md b/DYNAMIC_CLUSTERING_GUIDE.md new file mode 100644 index 0000000..eac217a --- /dev/null +++ b/DYNAMIC_CLUSTERING_GUIDE.md @@ -0,0 +1,504 @@ +# Dynamic Database Clustering - User Guide + +## Overview + +Dynamic Database Clustering enables on-demand creation of isolated, replicated rqlite database clusters with automatic resource management through hibernation. Each database runs as a separate 3-node cluster with its own data directory and port allocation. + +## Key Features + +✅ **Multi-Database Support** - Create unlimited isolated databases on-demand +✅ **3-Node Replication** - Fault-tolerant by default (configurable) +✅ **Auto Hibernation** - Idle databases hibernate to save resources +✅ **Transparent Wake-Up** - Automatic restart on access +✅ **App Namespacing** - Databases are scoped by application name +✅ **Decentralized Metadata** - LibP2P pubsub-based coordination +✅ **Failure Recovery** - Automatic node replacement on failures +✅ **Resource Optimization** - Dynamic port allocation and data isolation + +## Configuration + +### Node Configuration (`configs/node.yaml`) + +```yaml +node: + data_dir: "./data" + listen_addresses: + - "/ip4/0.0.0.0/tcp/4001" + max_connections: 50 + +database: + replication_factor: 3 # Number of replicas per database + hibernation_timeout: 60s # Idle time before hibernation + max_databases: 100 # Max databases per node + port_range_http_start: 5001 # HTTP port range start + port_range_http_end: 5999 # HTTP port range end + port_range_raft_start: 7001 # Raft port range start + port_range_raft_end: 7999 # Raft port range end + +discovery: + bootstrap_peers: + - "/ip4/127.0.0.1/tcp/4001/p2p/..." 
+  discovery_interval: 30s
+  health_check_interval: 10s
+```
+
+### Key Configuration Options
+
+#### `database.replication_factor` (default: 3)
+Number of nodes that will host each database cluster. Minimum 1, recommended 3 for fault tolerance.
+
+#### `database.hibernation_timeout` (default: 60s)
+Time of inactivity before a database is hibernated. Set to 0 to disable hibernation.
+
+#### `database.max_databases` (default: 100)
+Maximum number of databases this node can host simultaneously.
+
+#### `database.port_range_*`
+Port ranges for dynamic allocation. Each hosted database uses one HTTP and one Raft port on a node, so each range must contain at least `max_databases` ports.
+
+## Client Usage
+
+### Creating/Accessing Databases
+
+```go
+package main
+
+import (
+	"context"
+	"github.com/DeBrosOfficial/network/pkg/client"
+)
+
+func main() {
+	// Create client with app name for namespacing
+	cfg := client.DefaultClientConfig("myapp")
+	cfg.BootstrapPeers = []string{
+		"/ip4/127.0.0.1/tcp/4001/p2p/...",
+	}
+
+	c, err := client.NewClient(cfg)
+	if err != nil {
+		panic(err)
+	}
+
+	// Connect to network
+	if err := c.Connect(); err != nil {
+		panic(err)
+	}
+	defer c.Disconnect()
+
+	// Get database client (creates database if it doesn't exist)
+	db, err := c.Database().Database("users")
+	if err != nil {
+		panic(err)
+	}
+
+	// Use the database
+	ctx := context.Background()
+	err = db.CreateTable(ctx, `
+		CREATE TABLE users (
+			id INTEGER PRIMARY KEY,
+			name TEXT NOT NULL,
+			email TEXT UNIQUE
+		)
+	`)
+	if err != nil {
+		panic(err)
+	}
+
+	// Query data
+	result, err := db.Query(ctx, "SELECT * FROM users")
+	if err != nil {
+		panic(err)
+	}
+	_ = result
+	// ...
+}
+```
+
+### Database Naming
+
+Databases are automatically namespaced by your application name:
+- `client.Database("users")` → creates `myapp_users` internally
+- This prevents name collisions between different applications
+
+## Gateway API Usage
+
+If you prefer HTTP/REST API access instead of the Go client, you can use the gateway endpoints:
+
+### Base URL
+```
+http://gateway-host:8080/v1/database/
+```
+
+### Execute SQL (INSERT, UPDATE, DELETE, DDL)
+```bash
+POST /v1/database/exec
+Content-Type: application/json
+
+{
+  "database": "users",
+  "sql": "INSERT INTO users (name, email) VALUES (?, ?)",
+  "args": ["Alice", "alice@example.com"]
+}
+
+Response:
+{
+  "rows_affected": 1,
+  "last_insert_id": 1
+}
+```
+
+### Query Data (SELECT)
+```bash
+POST /v1/database/query
+Content-Type: application/json
+
+{
+  "database": "users",
+  "sql": "SELECT * FROM users WHERE name LIKE ?",
+  "args": ["A%"]
+}
+
+Response:
+{
+  "items": [
+    {"id": 1, "name": "Alice", "email": "alice@example.com"}
+  ],
+  "count": 1
+}
+```
+
+### Execute Transaction
+```bash
+POST /v1/database/transaction
+Content-Type: application/json
+
+{
+  "database": "users",
+  "queries": [
+    "INSERT INTO users (name, email) VALUES ('Bob', 'bob@example.com')",
+    "UPDATE users SET email = 'alice.new@example.com' WHERE name = 'Alice'"
+  ]
+}
+
+Response:
+{
+  "success": true
+}
+```
+
+### Get Schema
+```bash
+GET /v1/database/schema?database=users
+
+# OR
+
+POST /v1/database/schema
+Content-Type: application/json
+
+{
+  "database": "users"
+}
+
+Response:
+{
+  "tables": [
+    {
+      "name": "users",
+      "columns": ["id", "name", "email"]
+    }
+  ]
+}
+```
+
+### Create Table
+```bash
+POST /v1/database/create-table
+Content-Type: application/json
+
+{
+  "database": "users",
+  "schema": "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
+}
+
+Response:
+{
+  "rows_affected": 0
+}
+```
+
+### Drop Table
+```bash
+POST /v1/database/drop-table
+Content-Type:
application/json + +{ + "database": "users", + "table_name": "old_table" +} + +Response: +{ + "rows_affected": 0 +} +``` + +### List Databases +```bash +GET /v1/database/list + +Response: +{ + "databases": ["users", "products", "orders"] +} +``` + +### Important Notes + +1. **Authentication Required**: All endpoints require authentication (JWT or API key) +2. **Database Creation**: Databases are created automatically on first access +3. **Hibernation**: The gateway handles hibernation/wake-up transparently - you may experience a delay (< 8s) on first query to a hibernating database +4. **Timeouts**: Query timeout is 30s, transaction timeout is 60s +5. **Namespacing**: Database names are automatically prefixed with your app name +6. **Concurrent Access**: All endpoints are safe for concurrent use + +## Database Lifecycle + +### 1. Creation + +When you first access a database: + +1. **Request Broadcast** - Node broadcasts `DATABASE_CREATE_REQUEST` +2. **Node Selection** - Eligible nodes respond with available ports +3. **Coordinator Selection** - Deterministic coordinator (lowest peer ID) chosen +4. **Confirmation** - Coordinator selects nodes and broadcasts `DATABASE_CREATE_CONFIRM` +5. **Instance Startup** - Selected nodes start rqlite subprocesses +6. **Readiness** - Nodes report `active` status when ready + +**Typical creation time: < 10 seconds** + +### 2. Active State + +- Database instances run as rqlite subprocesses +- Each instance tracks `LastQuery` timestamp +- Queries update the activity timestamp +- Metadata synced across all network nodes + +### 3. Hibernation + +After `hibernation_timeout` of inactivity: + +1. **Idle Detection** - Nodes detect idle databases +2. **Idle Notification** - Nodes broadcast idle status +3. **Coordinated Shutdown** - When all nodes report idle, coordinator schedules shutdown +4. **Graceful Stop** - SIGTERM sent to rqlite processes +5. **Port Release** - Ports freed for reuse +6. **Status Update** - Metadata updated to `hibernating` + +**Data persists on disk during hibernation** + +### 4. Wake-Up + +On first query to hibernating database: + +1. **Detection** - Client/node detects `hibernating` status +2. **Wake Request** - Broadcast `DATABASE_WAKEUP_REQUEST` +3. **Port Allocation** - Reuse original ports or allocate new ones +4. **Instance Restart** - Restart rqlite with existing data +5. **Status Update** - Update to `active` when ready + +**Typical wake-up time: < 8 seconds** + +### 5. Failure Recovery + +When a node fails: + +1. **Health Detection** - Missed health checks trigger failure detection +2. **Replacement Request** - Surviving nodes broadcast `NODE_REPLACEMENT_NEEDED` +3. **Offers** - Healthy nodes with capacity offer to replace +4. **Selection** - First offer accepted (simple approach) +5. **Join Cluster** - New node joins existing Raft cluster +6. 
**Sync** - Data synced from existing members
+
+## Data Management
+
+### Data Directories
+
+Each database gets its own data directory:
+```
+./data/
+  ├── myapp_users/       # Database: users
+  │   └── rqlite/
+  │       ├── db.sqlite
+  │       └── raft/
+  ├── myapp_products/    # Database: products
+  │   └── rqlite/
+  └── myapp_orders/      # Database: orders
+      └── rqlite/
+```
+
+### Orphaned Data Cleanup
+
+On node startup, the system automatically:
+- Scans data directories
+- Checks against metadata
+- Removes directories for:
+  - Non-existent databases
+  - Databases where this node is not a member
+
+## Monitoring & Debugging
+
+### Structured Logging
+
+All operations are logged with structured fields:
+
+```
+INFO Starting cluster manager node_id=12D3... max_databases=100
+INFO Received database create request database=myapp_users requester=12D3...
+INFO Database instance started database=myapp_users http_port=5001 raft_port=7001
+INFO Database is idle database=myapp_users idle_time=62s
+INFO Database hibernated successfully database=myapp_users
+INFO Received wakeup request database=myapp_users
+INFO Database woke up successfully database=myapp_users
+```
+
+### Health Checks
+
+Nodes perform periodic health checks:
+- Every `health_check_interval` (default: 10s)
+- Tracks last-seen time for each peer
+- 3 missed checks → node marked unhealthy
+- Triggers replacement protocol for affected databases
+
+## Best Practices
+
+### 1. **Capacity Planning**
+
+```yaml
+# For 100 databases with 3-node replication:
+database:
+  max_databases: 100
+  port_range_http_start: 5001
+  port_range_http_end: 5200   # 100 databases need 100 HTTP ports; 200 leaves headroom
+  port_range_raft_start: 7001
+  port_range_raft_end: 7200
+```
+
+### 2. **Hibernation Tuning**
+
+- **High Traffic**: Set `hibernation_timeout: 300s` or higher
+- **Development**: Set `hibernation_timeout: 30s` for faster cycles
+- **Always-On DBs**: Set `hibernation_timeout: 0` to disable
+
+### 3. **Replication Factor**
+
+- **Development**: `replication_factor: 1` (single node, no replication)
+- **Production**: `replication_factor: 3` (fault tolerant)
+- **High Availability**: `replication_factor: 5` (survives 2 failures)
+
+### 4.
**Network Topology** + +- Use at least 3 nodes for `replication_factor: 3` +- Ensure `max_databases * replication_factor <= total_cluster_capacity` +- Example: 3 nodes × 100 max_databases = 300 database instances total + +## Troubleshooting + +### Database Creation Fails + +**Problem**: `insufficient nodes responded: got 1, need 3` + +**Solution**: +- Ensure you have at least `replication_factor` nodes online +- Check `max_databases` limit on nodes +- Verify port ranges aren't exhausted + +### Database Not Waking Up + +**Problem**: Database stays in `waking` status + +**Solution**: +- Check node logs for rqlite startup errors +- Verify rqlite binary is installed +- Check port conflicts (use different port ranges) +- Ensure data directory is accessible + +### Orphaned Data + +**Problem**: Disk space consumed by old databases + +**Solution**: +- Orphaned data is automatically cleaned on node restart +- Manual cleanup: Delete directories from `./data/` that don't match metadata +- Check logs for reconciliation results + +### Node Replacement Not Working + +**Problem**: Failed node not replaced + +**Solution**: +- Ensure remaining nodes have capacity (`CurrentDatabases < MaxDatabases`) +- Check network connectivity between nodes +- Verify health check interval is reasonable (not too aggressive) + +## Advanced Topics + +### Metadata Consistency + +- **Vector Clocks**: Each metadata update includes vector clock for conflict resolution +- **Gossip Protocol**: Periodic metadata sync via checksums +- **Eventual Consistency**: All nodes eventually agree on database state + +### Port Management + +- Ports allocated randomly within configured ranges +- Bind-probing ensures ports are actually available +- Ports reused during wake-up when possible +- Failed allocations fall back to new random ports + +### Coordinator Election + +- Deterministic selection based on lexicographical peer ID ordering +- Lowest peer ID becomes coordinator +- No persistent coordinator state +- Re-election occurs for each database operation + +## Migration from Legacy Mode + +If upgrading from single-cluster rqlite: + +1. **Backup Data**: Backup your existing `./data/rqlite` directory +2. **Update Config**: Remove deprecated fields: + - `database.data_dir` + - `database.rqlite_port` + - `database.rqlite_raft_port` + - `database.rqlite_join_address` +3. **Add New Fields**: Configure dynamic clustering (see Configuration section) +4. **Restart Nodes**: Restart all nodes with new configuration +5. **Migrate Data**: Create new database and import data from backup + +## Future Enhancements + +The following features are planned for future releases: + +### **Advanced Metrics** (Future) +- Prometheus-style metrics export +- Per-database query counters +- Hibernation/wake-up latency histograms +- Resource utilization gauges + +### **Performance Benchmarks** (Future) +- Automated benchmark suite +- Creation time SLOs +- Wake-up latency targets +- Query overhead measurements + +### **Enhanced Monitoring** (Future) +- Dashboard for cluster visualization +- Database status API endpoint +- Capacity planning tools +- Alerting integration + +## Support + +For issues, questions, or contributions: +- GitHub Issues: https://github.com/DeBrosOfficial/network/issues +- Documentation: https://github.com/DeBrosOfficial/network/blob/main/DYNAMIC_DATABASE_CLUSTERING.md + +## License + +See LICENSE file for details. 
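+
+## Appendix: Coordinator Election Sketch
+
+For contributors: the deterministic election described under "Coordinator Election" above fits in a few lines of Go. This is an illustrative sketch, not the actual implementation — the helper name and signature are assumptions:
+
+```go
+package main
+
+import (
+	"errors"
+	"fmt"
+	"sort"
+)
+
+// electCoordinator returns the lowest lexicographical peer ID.
+// Because every node sorts the same peer set the same way, all nodes
+// independently agree on the coordinator without exchanging votes.
+func electCoordinator(peerIDs []string) (string, error) {
+	if len(peerIDs) == 0 {
+		return "", errors.New("no peers available for election")
+	}
+	ids := append([]string(nil), peerIDs...) // copy to avoid mutating the caller's slice
+	sort.Strings(ids)
+	return ids[0], nil
+}
+
+func main() {
+	coordinator, err := electCoordinator([]string{"12D3KooWB", "12D3KooWC", "12D3KooWA"})
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println("coordinator:", coordinator) // prints: coordinator: 12D3KooWA
+}
+```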
+ diff --git a/Makefile b/Makefile index f0e8f22..70b5994 100644 --- a/Makefile +++ b/Makefile @@ -21,7 +21,7 @@ test-e2e: .PHONY: build clean test run-node run-node2 run-node3 run-example deps tidy fmt vet lint clear-ports -VERSION := 0.51.0-beta +VERSION := 0.60.0-beta COMMIT ?= $(shell git rev-parse --short HEAD 2>/dev/null || echo unknown) DATE ?= $(shell date -u +%Y-%m-%dT%H:%M:%SZ) LDFLAGS := -X 'main.version=$(VERSION)' -X 'main.commit=$(COMMIT)' -X 'main.date=$(DATE)' @@ -53,13 +53,25 @@ run-node: # Usage: make run-node2 JOINADDR=/ip4/127.0.0.1/tcp/5001 HTTP=5002 RAFT=7002 P2P=4002 run-node2: @echo "Starting regular node2 with config..." - go run ./cmd/node --config configs/node.yaml + go run ./cmd/node --config configs/node.yaml -id node2 -p2p-port 4002 # Run third node (regular) - requires join address of bootstrap node # Usage: make run-node3 JOINADDR=/ip4/127.0.0.1/tcp/5001 HTTP=5003 RAFT=7003 P2P=4003 run-node3: @echo "Starting regular node3 with config..." - go run ./cmd/node --config configs/node.yaml + go run ./cmd/node --config configs/node.yaml -id node3 -p2p-port 4003 + +run-node4: + @echo "Starting regular node4 with config..." + go run ./cmd/node --config configs/node.yaml -id node4 -p2p-port 4004 + +run-node5: + @echo "Starting regular node5 with config..." + go run ./cmd/node --config configs/node.yaml -id node5 -p2p-port 4005 + +run-node6: + @echo "Starting regular node6 with config..." + go run ./cmd/node --config configs/node.yaml -id node6 -p2p-port 4006 # Run gateway HTTP server # Usage examples: diff --git a/TESTING_GUIDE.md b/TESTING_GUIDE.md new file mode 100644 index 0000000..85d189b --- /dev/null +++ b/TESTING_GUIDE.md @@ -0,0 +1,827 @@ +# Dynamic Database Clustering - Testing Guide + +This guide provides a comprehensive list of unit tests, integration tests, and manual tests needed to verify the dynamic database clustering feature. + +## Unit Tests + +### 1. Metadata Store Tests (`pkg/rqlite/metadata_test.go`) + +```go +// Test cases to implement: + +func TestMetadataStore_GetSetDatabase(t *testing.T) + - Create store + - Set database metadata + - Get database metadata + - Verify data matches + +func TestMetadataStore_DeleteDatabase(t *testing.T) + - Set database metadata + - Delete database + - Verify Get returns nil + +func TestMetadataStore_ListDatabases(t *testing.T) + - Add multiple databases + - List all databases + - Verify count and contents + +func TestMetadataStore_ConcurrentAccess(t *testing.T) + - Spawn multiple goroutines + - Concurrent reads and writes + - Verify no race conditions (run with -race) + +func TestMetadataStore_NodeCapacity(t *testing.T) + - Set node capacity + - Get node capacity + - Update capacity + - List nodes +``` + +### 2. Vector Clock Tests (`pkg/rqlite/vector_clock_test.go`) + +```go +func TestVectorClock_Increment(t *testing.T) + - Create empty vector clock + - Increment for node A + - Verify counter is 1 + - Increment again + - Verify counter is 2 + +func TestVectorClock_Merge(t *testing.T) + - Create two vector clocks with different nodes + - Merge them + - Verify max values are preserved + +func TestVectorClock_Compare(t *testing.T) + - Test strictly less than case + - Test strictly greater than case + - Test concurrent case + - Test identical case + +func TestVectorClock_Concurrent(t *testing.T) + - Create clocks with overlapping updates + - Verify Compare returns 0 (concurrent) +``` + +### 3. 
Consensus Tests (`pkg/rqlite/consensus_test.go`) + +```go +func TestElectCoordinator_SingleNode(t *testing.T) + - Pass single node ID + - Verify it's elected + +func TestElectCoordinator_MultipleNodes(t *testing.T) + - Pass multiple node IDs + - Verify lowest lexicographical ID wins + - Verify deterministic (same input = same output) + +func TestElectCoordinator_EmptyList(t *testing.T) + - Pass empty list + - Verify error returned + +func TestElectCoordinator_Deterministic(t *testing.T) + - Run election multiple times with same inputs + - Verify same coordinator each time +``` + +### 4. Port Manager Tests (`pkg/rqlite/ports_test.go`) + +```go +func TestPortManager_AllocatePortPair(t *testing.T) + - Create manager with port range + - Allocate port pair + - Verify HTTP and Raft ports different + - Verify ports within range + +func TestPortManager_ReleasePortPair(t *testing.T) + - Allocate port pair + - Release ports + - Verify ports can be reallocated + +func TestPortManager_Exhaustion(t *testing.T) + - Allocate all available ports + - Attempt one more allocation + - Verify error returned + +func TestPortManager_IsPortAllocated(t *testing.T) + - Allocate ports + - Check IsPortAllocated returns true + - Release ports + - Check IsPortAllocated returns false + +func TestPortManager_AllocateSpecificPorts(t *testing.T) + - Allocate specific ports + - Verify allocation succeeds + - Attempt to allocate same ports again + - Verify error returned +``` + +### 5. RQLite Instance Tests (`pkg/rqlite/instance_test.go`) + +```go +func TestRQLiteInstance_Create(t *testing.T) + - Create instance configuration + - Verify fields set correctly + +func TestRQLiteInstance_IsIdle(t *testing.T) + - Set LastQuery to old timestamp + - Verify IsIdle returns true + - Update LastQuery + - Verify IsIdle returns false + +// Integration test (requires rqlite binary): +func TestRQLiteInstance_StartStop(t *testing.T) + - Create instance + - Start instance + - Verify HTTP endpoint responsive + - Stop instance + - Verify process terminated +``` + +### 6. Pubsub Message Tests (`pkg/rqlite/pubsub_messages_test.go`) + +```go +func TestMarshalUnmarshalMetadataMessage(t *testing.T) + - Create each message type + - Marshal to bytes + - Unmarshal back + - Verify data preserved + +func TestDatabaseCreateRequest_Marshal(t *testing.T) +func TestDatabaseCreateResponse_Marshal(t *testing.T) +func TestDatabaseCreateConfirm_Marshal(t *testing.T) +func TestDatabaseStatusUpdate_Marshal(t *testing.T) +// ... for all message types +``` + +### 7. Coordinator Tests (`pkg/rqlite/coordinator_test.go`) + +```go +func TestCreateCoordinator_AddResponse(t *testing.T) + - Create coordinator + - Add responses + - Verify response count + +func TestCreateCoordinator_SelectNodes(t *testing.T) + - Add more responses than needed + - Call SelectNodes + - Verify correct number selected + - Verify deterministic selection + +func TestCreateCoordinator_WaitForResponses(t *testing.T) + - Create coordinator + - Wait in goroutine + - Add responses from another goroutine + - Verify wait completes when enough responses + +func TestCoordinatorRegistry(t *testing.T) + - Register coordinator + - Get coordinator + - Remove coordinator + - Verify lifecycle +``` + +## Integration Tests + +### 1. 
Single Node Database Creation (`e2e/single_node_database_test.go`) + +```go +func TestSingleNodeDatabaseCreation(t *testing.T) + - Start 1 node + - Set replication_factor = 1 + - Create database + - Verify database active + - Write data + - Read data back + - Verify data matches +``` + +### 2. Three Node Database Creation (`e2e/three_node_database_test.go`) + +```go +func TestThreeNodeDatabaseCreation(t *testing.T) + - Start 3 nodes + - Set replication_factor = 3 + - Create database from node 1 + - Wait for all nodes to report active + - Write data to node 1 + - Read from node 2 + - Verify replication worked +``` + +### 3. Multiple Databases (`e2e/multiple_databases_test.go`) + +```go +func TestMultipleDatabases(t *testing.T) + - Start 3 nodes + - Create database "users" + - Create database "products" + - Create database "orders" + - Verify all databases active + - Write to each database + - Verify data isolation +``` + +### 4. Hibernation Cycle (`e2e/hibernation_test.go`) + +```go +func TestHibernationCycle(t *testing.T) + - Start 3 nodes with hibernation_timeout=5s + - Create database + - Write initial data + - Wait 10 seconds (no activity) + - Verify status = hibernating + - Verify processes stopped + - Verify data persisted on disk + +func TestWakeUpCycle(t *testing.T) + - Create and hibernate database + - Issue query + - Wait for wake-up + - Verify status = active + - Verify data still accessible + - Verify LastQuery updated +``` + +### 5. Node Failure and Recovery (`e2e/failure_recovery_test.go`) + +```go +func TestNodeFailureDetection(t *testing.T) + - Start 3 nodes + - Create database + - Kill one node (SIGKILL) + - Wait for health checks to detect failure + - Verify NODE_REPLACEMENT_NEEDED broadcast + +func TestNodeReplacement(t *testing.T) + - Start 4 nodes + - Create database on nodes 1,2,3 + - Kill node 3 + - Wait for replacement + - Verify node 4 joins cluster + - Verify data accessible from node 4 +``` + +### 6. Orphaned Data Cleanup (`e2e/cleanup_test.go`) + +```go +func TestOrphanedDataCleanup(t *testing.T) + - Start node + - Manually create orphaned data directory + - Restart node + - Verify orphaned directory removed + - Check logs for reconciliation message +``` + +### 7. Concurrent Operations (`e2e/concurrent_test.go`) + +```go +func TestConcurrentDatabaseCreation(t *testing.T) + - Start 5 nodes + - Create 10 databases concurrently + - Verify all successful + - Verify no port conflicts + - Verify proper distribution + +func TestConcurrentHibernation(t *testing.T) + - Create multiple databases + - Let all go idle + - Verify all hibernate correctly + - No race conditions +``` + +## Manual Test Scenarios + +### Test 1: Basic Flow - Three Node Cluster + +**Setup:** +```bash +# Terminal 1: Bootstrap node +cd data/bootstrap +../../bin/node --data bootstrap --id bootstrap --p2p-port 4001 + +# Terminal 2: Node 2 +cd data/node +../../bin/node --data node --id node2 --p2p-port 4002 + +# Terminal 3: Node 3 +cd data/node2 +../../bin/node --data node2 --id node3 --p2p-port 4003 +``` + +**Test Steps:** +1. **Create Database** + ```bash + # Use client or API to create database "testdb" + ``` + +2. **Verify Creation** + - Check logs on all 3 nodes for "Database instance started" + - Verify `./data/*/testdb/` directories exist on all nodes + - Check different ports allocated on each node + +3. **Write Data** + ```sql + CREATE TABLE users (id INT, name TEXT); + INSERT INTO users VALUES (1, 'Alice'); + INSERT INTO users VALUES (2, 'Bob'); + ``` + +4. 
**Verify Replication** + - Query from each node + - Verify same data returned + +**Expected Results:** +- All nodes show `status=active` for testdb +- Data replicated across all nodes +- Unique port pairs per node + +--- + +### Test 2: Hibernation and Wake-Up + +**Setup:** Same as Test 1 with database created + +**Test Steps:** +1. **Check Activity** + ```bash + # In logs, verify "last_query" timestamps updating on queries + ``` + +2. **Wait for Hibernation** + - Stop issuing queries + - Wait `hibernation_timeout` + 10s + - Check logs for "Database is idle" + - Verify "Coordinated shutdown message sent" + - Verify "Database hibernated successfully" + +3. **Verify Hibernation** + ```bash + # Check that rqlite processes are stopped + ps aux | grep rqlite + + # Verify data directories still exist + ls -la data/*/testdb/ + ``` + +4. **Wake Up** + - Issue a query to the database + - Watch logs for "Received wakeup request" + - Verify "Database woke up successfully" + - Verify query succeeds + +**Expected Results:** +- Hibernation happens after idle timeout +- All 3 nodes hibernate coordinated +- Wake-up completes in < 8 seconds +- Data persists across hibernation cycle + +--- + +### Test 3: Multiple Databases + +**Setup:** 3 nodes running + +**Test Steps:** +1. **Create Multiple Databases** + ``` + Create: users_db + Create: products_db + Create: orders_db + ``` + +2. **Verify Isolation** + - Insert data in users_db + - Verify data NOT in products_db + - Verify data NOT in orders_db + +3. **Check Port Allocation** + ```bash + # Verify different ports for each database + netstat -tlnp | grep rqlite + # OR + ss -tlnp | grep rqlite + ``` + +4. **Verify Data Directories** + ```bash + tree data/bootstrap/ + # Should show: + # ├── users_db/ + # ├── products_db/ + # └── orders_db/ + ``` + +**Expected Results:** +- 3 separate database clusters +- Each with 3 nodes (9 total instances) +- Complete data isolation +- Unique port pairs for each instance + +--- + +### Test 4: Node Failure and Recovery + +**Setup:** 4 nodes running, database created on nodes 1-3 + +**Test Steps:** +1. **Verify Initial State** + - Database active on nodes 1, 2, 3 + - Node 4 idle + +2. **Simulate Failure** + ```bash + # Kill node 3 (SIGKILL for unclean shutdown) + kill -9 + ``` + +3. **Watch for Detection** + - Check logs on nodes 1 and 2 + - Wait for health check failures (3 missed pings) + - Verify "Node detected as unhealthy" messages + +4. **Watch for Replacement** + - Check for "NODE_REPLACEMENT_NEEDED" broadcast + - Node 4 should offer to replace + - Verify "Starting as replacement node" on node 4 + - Verify node 4 joins Raft cluster + +5. **Verify Data Integrity** + - Query database from node 4 + - Verify all data present + - Insert new data from node 4 + - Verify replication to nodes 1 and 2 + +**Expected Results:** +- Failure detected within 30 seconds +- Replacement completes automatically +- Data accessible from new node +- No data loss + +--- + +### Test 5: Port Exhaustion + +**Setup:** 1 node with small port range + +**Configuration:** +```yaml +database: + max_databases: 10 + port_range_http_start: 5001 + port_range_http_end: 5005 # Only 5 ports + port_range_raft_start: 7001 + port_range_raft_end: 7005 # Only 5 ports +``` + +**Test Steps:** +1. **Create Databases** + - Create database 1 (succeeds - uses 2 ports) + - Create database 2 (succeeds - uses 2 ports) + - Create database 3 (fails - only 1 port left) + +2. **Verify Error** + - Check logs for "Cannot allocate ports" + - Verify error returned to client + +3. 
**Free Ports** + - Hibernate or delete database 1 + - Ports should be freed + +4. **Retry** + - Create database 3 again + - Should succeed now + +**Expected Results:** +- Graceful handling of port exhaustion +- Clear error messages +- Ports properly recycled + +--- + +### Test 6: Orphaned Data Cleanup + +**Setup:** 1 node stopped + +**Test Steps:** +1. **Create Orphaned Data** + ```bash + # While node is stopped + mkdir -p data/bootstrap/orphaned_db/rqlite + echo "fake data" > data/bootstrap/orphaned_db/rqlite/db.sqlite + ``` + +2. **Start Node** + ```bash + ./bin/node --data bootstrap --id bootstrap + ``` + +3. **Check Reconciliation** + - Watch logs for "Starting orphaned data reconciliation" + - Verify "Found orphaned database directory" + - Verify "Removed orphaned database directory" + +4. **Verify Cleanup** + ```bash + ls data/bootstrap/ + # orphaned_db should be gone + ``` + +**Expected Results:** +- Orphaned directories automatically detected +- Removed on startup +- Clean reconciliation logged + +--- + +### Test 7: Stress Test - Many Databases + +**Setup:** 5 nodes with high capacity + +**Configuration:** +```yaml +database: + max_databases: 50 + port_range_http_start: 5001 + port_range_http_end: 5150 + port_range_raft_start: 7001 + port_range_raft_end: 7150 +``` + +**Test Steps:** +1. **Create Many Databases** + ``` + Loop: Create databases db_1 through db_25 + ``` + +2. **Verify Distribution** + - Check logs for node capacity announcements + - Verify databases distributed across nodes + - No single node overloaded + +3. **Concurrent Operations** + - Write to multiple databases simultaneously + - Read from multiple databases + - Verify no conflicts + +4. **Hibernation Wave** + - Stop all activity + - Wait for hibernation + - Verify all databases hibernate + - Check resource usage drops + +5. **Wake-Up Storm** + - Query all 25 databases at once + - Verify all wake up successfully + - Check for thundering herd issues + +**Expected Results:** +- All 25 databases created successfully +- Even distribution across nodes +- No port conflicts +- Successful mass hibernation/wake-up + +--- + +### Test 8: Gateway API Access + +**Setup:** Gateway running with 3 nodes + +**Test Steps:** +1. **Authenticate** + ```bash + # Get JWT token + TOKEN=$(curl -X POST http://localhost:8080/v1/auth/login \ + -H "Content-Type: application/json" \ + -d '{"wallet": "..."}' | jq -r .token) + ``` + +2. **Create Table** + ```bash + curl -X POST http://localhost:8080/v1/database/create-table \ + -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{ + "database": "testdb", + "schema": "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)" + }' + ``` + +3. **Insert Data** + ```bash + curl -X POST http://localhost:8080/v1/database/exec \ + -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{ + "database": "testdb", + "sql": "INSERT INTO users (name, email) VALUES (?, ?)", + "args": ["Alice", "alice@example.com"] + }' + ``` + +4. **Query Data** + ```bash + curl -X POST http://localhost:8080/v1/database/query \ + -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{ + "database": "testdb", + "sql": "SELECT * FROM users" + }' + ``` + +5. 
**Test Transaction** + ```bash + curl -X POST http://localhost:8080/v1/database/transaction \ + -H "Authorization: Bearer $TOKEN" \ + -H "Content-Type: application/json" \ + -d '{ + "database": "testdb", + "queries": [ + "INSERT INTO users (name, email) VALUES (\"Bob\", \"bob@example.com\")", + "INSERT INTO users (name, email) VALUES (\"Charlie\", \"charlie@example.com\")" + ] + }' + ``` + +6. **Get Schema** + ```bash + curl -X GET "http://localhost:8080/v1/database/schema?database=testdb" \ + -H "Authorization: Bearer $TOKEN" + ``` + +7. **Test Hibernation** + - Wait for hibernation timeout + - Query again and measure wake-up time + - Should see delay on first query after hibernation + +**Expected Results:** +- All API calls succeed +- Data persists across calls +- Transactions are atomic +- Schema reflects created tables +- Hibernation/wake-up transparent to API +- Response times reasonable (< 30s for queries) + +--- + +## Test Checklist + +### Unit Tests (To Implement) +- [ ] Metadata Store operations +- [ ] Metadata Store concurrency +- [ ] Vector Clock increment +- [ ] Vector Clock merge +- [ ] Vector Clock compare +- [ ] Coordinator election (single node) +- [ ] Coordinator election (multiple nodes) +- [ ] Coordinator election (deterministic) +- [ ] Port Manager allocation +- [ ] Port Manager release +- [ ] Port Manager exhaustion +- [ ] Port Manager specific ports +- [ ] RQLite Instance creation +- [ ] RQLite Instance IsIdle +- [ ] Message marshal/unmarshal (all types) +- [ ] Coordinator response collection +- [ ] Coordinator node selection +- [ ] Coordinator registry + +### Integration Tests (To Implement) +- [ ] Single node database creation +- [ ] Three node database creation +- [ ] Multiple databases isolation +- [ ] Hibernation cycle +- [ ] Wake-up cycle +- [ ] Node failure detection +- [ ] Node replacement +- [ ] Orphaned data cleanup +- [ ] Concurrent database creation +- [ ] Concurrent hibernation + +### Manual Tests (To Perform) +- [ ] Basic three node flow +- [ ] Hibernation and wake-up +- [ ] Multiple databases +- [ ] Node failure and recovery +- [ ] Port exhaustion handling +- [ ] Orphaned data cleanup +- [ ] Stress test with many databases + +### Performance Validation +- [ ] Database creation < 10s +- [ ] Wake-up time < 8s +- [ ] Metadata sync < 5s +- [ ] Query overhead < 10ms additional + +## Running Tests + +### Unit Tests +```bash +# Run all tests +go test ./pkg/rqlite/... -v + +# Run with race detector +go test ./pkg/rqlite/... -race + +# Run specific test +go test ./pkg/rqlite/ -run TestMetadataStore_GetSetDatabase -v + +# Run with coverage +go test ./pkg/rqlite/... -cover -coverprofile=coverage.out +go tool cover -html=coverage.out +``` + +### Integration Tests +```bash +# Run e2e tests +go test ./e2e/... -v -timeout 30m + +# Run specific e2e test +go test ./e2e/ -run TestThreeNodeDatabaseCreation -v +``` + +### Manual Tests +Follow the scenarios above in dedicated terminals for each node. 
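+
+### Example: Coordinator Election Unit Test
+
+As a starting point for the unit-test checklist above, here is a sketch of `TestElectCoordinator_Deterministic` (it assumes an `ElectCoordinator(ids []string) (string, error)` helper in `pkg/rqlite`; adjust to the actual signature):
+
+```go
+package rqlite
+
+import "testing"
+
+// Sketch: deterministic election should always pick the lowest
+// lexicographical peer ID, regardless of input order.
+func TestElectCoordinator_Deterministic(t *testing.T) {
+	ids := []string{"12D3KooWC", "12D3KooWA", "12D3KooWB"}
+
+	first, err := ElectCoordinator(ids)
+	if err != nil {
+		t.Fatalf("unexpected error: %v", err)
+	}
+	if first != "12D3KooWA" {
+		t.Fatalf("expected lowest peer ID to win, got %s", first)
+	}
+
+	// Repeated elections over the same input must return the same result.
+	for i := 0; i < 10; i++ {
+		got, err := ElectCoordinator(ids)
+		if err != nil {
+			t.Fatalf("unexpected error on run %d: %v", i, err)
+		}
+		if got != first {
+			t.Fatalf("election not deterministic: got %s, want %s", got, first)
+		}
+	}
+}
+```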
+ +## Success Criteria + +### Correctness +✅ All unit tests pass +✅ All integration tests pass +✅ All manual scenarios complete successfully +✅ No data loss in any scenario +✅ No race conditions detected + +### Performance +✅ Database creation < 10 seconds +✅ Wake-up < 8 seconds +✅ Metadata sync < 5 seconds +✅ Query overhead < 10ms + +### Reliability +✅ Survives node failures +✅ Automatic recovery works +✅ No orphaned data accumulates +✅ Hibernation/wake-up cycles stable +✅ Concurrent operations safe + +## Notes for Future Test Enhancements + +When implementing advanced metrics and benchmarks: + +1. **Prometheus Metrics Tests** + - Verify metric export + - Validate metric values + - Test metric reset on restart + +2. **Benchmark Suite** + - Automated performance regression detection + - Latency percentile tracking (p50, p95, p99) + - Throughput measurements + - Resource usage profiling + +3. **Chaos Engineering** + - Random node kills + - Network partitions + - Clock skew simulation + - Disk full scenarios + +4. **Long-Running Stability** + - 24-hour soak test + - Memory leak detection + - Slow-growing resource usage + +## Debugging Failed Tests + +### Common Issues + +**Port Conflicts** +```bash +# Check for processes using test ports +lsof -i :5001-5999 +lsof -i :7001-7999 + +# Kill stale processes +pkill rqlited +``` + +**Stale Data** +```bash +# Clean test data directories +rm -rf data/test_*/ +rm -rf /tmp/debros_test_*/ +``` + +**Timing Issues** +- Increase timeouts in flaky tests +- Add retry logic with exponential backoff +- Use proper synchronization primitives + +**Race Conditions** +```bash +# Always run with race detector during development +go test -race ./... +``` + + diff --git a/cmd/node/main.go b/cmd/node/main.go index 53fb562..dcdcd16 100644 --- a/cmd/node/main.go +++ b/cmd/node/main.go @@ -31,16 +31,13 @@ func setup_logger(component logging.Component) (logger *logging.ColoredLogger) { } // parse_and_return_network_flags it initializes all the network flags coming from the .yaml files -func parse_and_return_network_flags() (configPath *string, dataDir, nodeID *string, p2pPort, rqlHTTP, rqlRaft *int, rqlJoinAddr *string, advAddr *string, help *bool) { +func parse_and_return_network_flags() (configPath *string, dataDir, nodeID *string, p2pPort *int, advAddr *string, help *bool, loadedConfig *config.Config) { logger := setup_logger(logging.ComponentNode) configPath = flag.String("config", "", "Path to config YAML file (overrides defaults)") dataDir = flag.String("data", "", "Data directory (auto-detected if not provided)") nodeID = flag.String("id", "", "Node identifier (for running multiple local nodes)") p2pPort = flag.Int("p2p-port", 4001, "LibP2P listen port") - rqlHTTP = flag.Int("rqlite-http-port", 5001, "RQLite HTTP API port") - rqlRaft = flag.Int("rqlite-raft-port", 7001, "RQLite Raft port") - rqlJoinAddr = flag.String("rqlite-join-address", "", "RQLite address to join (e.g., /ip4/)") advAddr = flag.String("adv-addr", "127.0.0.1", "Default Advertise address for rqlite and rafts") help = flag.Bool("help", false, "Show help") flag.Parse() @@ -55,33 +52,18 @@ func parse_and_return_network_flags() (configPath *string, dataDir, nodeID *stri } logger.ComponentInfo(logging.ComponentNode, "Configuration loaded from YAML file", zap.String("path", *configPath)) - // Instead of returning flag values, return config values - // For ListenAddresses, extract port from multiaddr string if possible, else use default - var p2pPortVal int - if len(cfg.Node.ListenAddresses) > 0 { - // 
Try to parse port from multiaddr string - var port int - _, err := fmt.Sscanf(cfg.Node.ListenAddresses[0], "/ip4/0.0.0.0/tcp/%d", &port) - if err == nil { - p2pPortVal = port - } else { - p2pPortVal = 4001 - } - } else { - p2pPortVal = 4001 - } + // Return config values but preserve command line flag values for overrides + // The command line flags will be applied later in load_args_into_config return configPath, &cfg.Node.DataDir, &cfg.Node.ID, - &p2pPortVal, - &cfg.Database.RQLitePort, - &cfg.Database.RQLiteRaftPort, - &cfg.Database.RQLiteJoinAddress, + p2pPort, // Keep the command line flag value &cfg.Discovery.HttpAdvAddress, - help + help, + cfg // Return the loaded config } - return + return configPath, dataDir, nodeID, p2pPort, advAddr, help, nil } // LoadConfigFromYAML loads a config from a YAML file @@ -109,8 +91,13 @@ func check_if_should_open_help(help *bool) { func select_data_dir(dataDir *string, nodeID *string) { logger := setup_logger(logging.ComponentNode) - if *nodeID == "" { - *dataDir = "./data/node" + // If dataDir is not set from config, set it based on nodeID + if *dataDir == "" { + if *nodeID == "" { + *dataDir = "./data/node" + } else { + *dataDir = fmt.Sprintf("./data/%s", *nodeID) + } } logger.Info("Successfully selected Data Directory of: %s", zap.String("dataDir", *dataDir)) @@ -151,38 +138,30 @@ func startNode(ctx context.Context, cfg *config.Config, port int) error { } // load_args_into_config applies command line argument overrides to the config -func load_args_into_config(cfg *config.Config, p2pPort, rqlHTTP, rqlRaft *int, rqlJoinAddr *string, advAddr *string, dataDir *string) { +func load_args_into_config(cfg *config.Config, p2pPort *int, advAddr *string, dataDir *string) { logger := setup_logger(logging.ComponentNode) - // Apply RQLite HTTP port override - if *rqlHTTP != 5001 { - cfg.Database.RQLitePort = *rqlHTTP - logger.ComponentInfo(logging.ComponentNode, "Overriding RQLite HTTP port", zap.Int("port", *rqlHTTP)) + // Apply P2P port override - check if command line port differs from config + var configPort int = 4001 // default + if len(cfg.Node.ListenAddresses) > 0 { + // Try to parse port from multiaddr string in config + _, err := fmt.Sscanf(cfg.Node.ListenAddresses[0], "/ip4/0.0.0.0/tcp/%d", &configPort) + if err != nil { + configPort = 4001 // fallback to default + } } - // Apply RQLite Raft port override - if *rqlRaft != 7001 { - cfg.Database.RQLiteRaftPort = *rqlRaft - logger.ComponentInfo(logging.ComponentNode, "Overriding RQLite Raft port", zap.Int("port", *rqlRaft)) - } - - // Apply P2P port override - if *p2pPort != 4001 { + // Override if command line port is different from config port + if *p2pPort != configPort { cfg.Node.ListenAddresses = []string{ fmt.Sprintf("/ip4/0.0.0.0/tcp/%d", *p2pPort), } logger.ComponentInfo(logging.ComponentNode, "Overriding P2P port", zap.Int("port", *p2pPort)) } - // Apply RQLite join address - if *rqlJoinAddr != "" { - cfg.Database.RQLiteJoinAddress = *rqlJoinAddr - logger.ComponentInfo(logging.ComponentNode, "Setting RQLite join address", zap.String("address", *rqlJoinAddr)) - } - if *advAddr != "" { - cfg.Discovery.HttpAdvAddress = fmt.Sprintf("%s:%d", *advAddr, *rqlHTTP) - cfg.Discovery.RaftAdvAddress = fmt.Sprintf("%s:%d", *advAddr, *rqlRaft) + cfg.Discovery.HttpAdvAddress = *advAddr + cfg.Discovery.RaftAdvAddress = *advAddr } if *dataDir != "" { @@ -193,30 +172,35 @@ func load_args_into_config(cfg *config.Config, p2pPort, rqlHTTP, rqlRaft *int, r func main() { logger := setup_logger(logging.ComponentNode) 
- _, dataDir, nodeID, p2pPort, rqlHTTP, rqlRaft, rqlJoinAddr, advAddr, help := parse_and_return_network_flags() + _, dataDir, nodeID, p2pPort, advAddr, help, loadedConfig := parse_and_return_network_flags() check_if_should_open_help(help) select_data_dir(dataDir, nodeID) - // Load Node Configuration + // Load Node Configuration - use loaded config if available, otherwise use default var cfg *config.Config - cfg = config.DefaultConfig() - logger.ComponentInfo(logging.ComponentNode, "Default configuration loaded successfully") + if loadedConfig != nil { + cfg = loadedConfig + logger.ComponentInfo(logging.ComponentNode, "Using configuration from YAML file") + } else { + cfg = config.DefaultConfig() + logger.ComponentInfo(logging.ComponentNode, "Using default configuration") + } // Apply command line argument overrides - load_args_into_config(cfg, p2pPort, rqlHTTP, rqlRaft, rqlJoinAddr, advAddr, dataDir) + load_args_into_config(cfg, p2pPort, advAddr, dataDir) logger.ComponentInfo(logging.ComponentNode, "Command line arguments applied to configuration") - // LibP2P uses configurable port (default 4001); RQLite uses 5001 (HTTP) and 7001 (Raft) + // LibP2P uses configurable port (default 4001) port := *p2pPort logger.ComponentInfo(logging.ComponentNode, "Node configuration summary", zap.Strings("listen_addresses", cfg.Node.ListenAddresses), - zap.Int("rqlite_http_port", cfg.Database.RQLitePort), - zap.Int("rqlite_raft_port", cfg.Database.RQLiteRaftPort), zap.Int("p2p_port", port), zap.Strings("bootstrap_peers", cfg.Discovery.BootstrapPeers), - zap.String("rqlite_join_address", cfg.Database.RQLiteJoinAddress), + zap.Int("max_databases", cfg.Database.MaxDatabases), + zap.String("port_range_http", fmt.Sprintf("%d-%d", cfg.Database.PortRangeHTTPStart, cfg.Database.PortRangeHTTPEnd)), + zap.String("port_range_raft", fmt.Sprintf("%d-%d", cfg.Database.PortRangeRaftStart, cfg.Database.PortRangeRaftEnd)), zap.String("data_directory", *dataDir)) // Create context for graceful shutdown diff --git a/pkg/client/defaults.go b/pkg/client/defaults.go index c42f6ae..e764b63 100644 --- a/pkg/client/defaults.go +++ b/pkg/client/defaults.go @@ -1,175 +1,39 @@ package client import ( - "os" - "strconv" - "strings" + "fmt" + "time" "github.com/DeBrosOfficial/network/pkg/config" - "github.com/multiformats/go-multiaddr" ) -// DefaultBootstrapPeers returns the library's default bootstrap peer multiaddrs. -// These can be overridden by environment variables or config. -func DefaultBootstrapPeers() []string { +// DefaultClientConfig returns a default client configuration +func DefaultClientConfig(appName string) *ClientConfig { defaultCfg := config.DefaultConfig() - return defaultCfg.Discovery.BootstrapPeers -} -// DefaultDatabaseEndpoints returns default DB HTTP endpoints. -// These can be overridden by environment variables or config. 
-func DefaultDatabaseEndpoints() []string { - // Check environment variable first - if envNodes := os.Getenv("RQLITE_NODES"); envNodes != "" { - return normalizeEndpoints(splitCSVOrSpace(envNodes)) - } - - // Get default port from environment or use port from config - defaultCfg := config.DefaultConfig() - port := defaultCfg.Database.RQLitePort - if envPort := os.Getenv("RQLITE_PORT"); envPort != "" { - if p, err := strconv.Atoi(envPort); err == nil && p > 0 { - port = p - } - } - - // Try to derive from bootstrap peers if available - peers := DefaultBootstrapPeers() - if len(peers) > 0 { - endpoints := make([]string, 0, len(peers)) - for _, s := range peers { - ma, err := multiaddr.NewMultiaddr(s) - if err != nil { - continue - } - endpoints = append(endpoints, endpointFromMultiaddr(ma, port)) - } - return dedupeStrings(endpoints) - } - - // Fallback to localhost - return []string{"http://localhost:" + strconv.Itoa(port)} -} - -// MapAddrsToDBEndpoints converts a set of peer multiaddrs to DB HTTP endpoints using dbPort. -func MapAddrsToDBEndpoints(addrs []multiaddr.Multiaddr, dbPort int) []string { - if dbPort <= 0 { - dbPort = 5001 - } - eps := make([]string, 0, len(addrs)) - for _, ma := range addrs { - eps = append(eps, endpointFromMultiaddr(ma, dbPort)) - } - return dedupeStrings(eps) -} - -// endpointFromMultiaddr extracts host from multiaddr and creates HTTP endpoint -func endpointFromMultiaddr(ma multiaddr.Multiaddr, port int) string { - var host string - - // Prefer DNS if present, then IP - if v, err := ma.ValueForProtocol(multiaddr.P_DNS); err == nil && v != "" { - host = v - } - if host == "" { - if v, err := ma.ValueForProtocol(multiaddr.P_DNS4); err == nil && v != "" { - host = v - } - } - if host == "" { - if v, err := ma.ValueForProtocol(multiaddr.P_DNS6); err == nil && v != "" { - host = v - } - } - if host == "" { - if v, err := ma.ValueForProtocol(multiaddr.P_IP4); err == nil && v != "" { - host = v - } - } - if host == "" { - if v, err := ma.ValueForProtocol(multiaddr.P_IP6); err == nil && v != "" { - host = "[" + v + "]" // IPv6 needs brackets in URLs - } - } - if host == "" { - host = "localhost" - } - - return "http://" + host + ":" + strconv.Itoa(port) -} - -// normalizeEndpoints ensures each endpoint has an http scheme and a port (defaults to 5001) -func normalizeEndpoints(in []string) []string { - out := make([]string, 0, len(in)) - for _, s := range in { - s = strings.TrimSpace(s) - if s == "" { - continue - } - - // Prepend scheme if missing - if !strings.HasPrefix(s, "http://") && !strings.HasPrefix(s, "https://") { - s = "http://" + s - } - - // Simple check for port (doesn't handle all cases but good enough) - if !strings.Contains(s, ":5001") && !strings.Contains(s, ":500") && !strings.Contains(s, ":501") { - // Check if there's already a port after the host - parts := strings.Split(s, "://") - if len(parts) == 2 { - hostPart := parts[1] - // Count colons to detect port (simple heuristic) - colonCount := strings.Count(hostPart, ":") - if colonCount == 0 || (strings.Contains(hostPart, "[") && colonCount == 1) { - // No port found, add default - s = s + ":5001" - } - } - } - - out = append(out, s) - } - return out -} - -// dedupeStrings removes duplicate strings from slice -func dedupeStrings(in []string) []string { - if len(in) == 0 { - return in - } - - seen := make(map[string]struct{}, len(in)) - out := make([]string, 0, len(in)) - - for _, s := range in { - s = strings.TrimSpace(s) - if s == "" { - continue - } - if _, ok := seen[s]; ok { - continue - } - 
seen[s] = struct{}{} - out = append(out, s) - } - - return out -} - -// splitCSVOrSpace splits a string by commas or spaces -func splitCSVOrSpace(s string) []string { - // Replace commas with spaces, then split on spaces - s = strings.ReplaceAll(s, ",", " ") - fields := strings.Fields(s) - return fields -} - -// truthy reports if s is a common truthy string -func truthy(s string) bool { - switch strings.ToLower(strings.TrimSpace(s)) { - case "1", "true", "yes", "on": - return true - default: - return false + return &ClientConfig{ + AppName: appName, + DatabaseName: fmt.Sprintf("%s_db", appName), + BootstrapPeers: defaultCfg.Discovery.BootstrapPeers, + DatabaseEndpoints: []string{}, + ConnectTimeout: 30 * time.Second, + RetryAttempts: 3, + RetryDelay: 5 * time.Second, + QuietMode: false, + APIKey: "", + JWT: "", } } + +// ValidateClientConfig validates a client configuration +func ValidateClientConfig(cfg *ClientConfig) error { + if len(cfg.BootstrapPeers) == 0 { + return fmt.Errorf("at least one bootstrap peer is required") + } + + if cfg.AppName == "" { + return fmt.Errorf("app name is required") + } + + return nil +} diff --git a/pkg/client/defaults_test.go b/pkg/client/defaults_test.go deleted file mode 100644 index 7bb37a5..0000000 --- a/pkg/client/defaults_test.go +++ /dev/null @@ -1,52 +0,0 @@ -package client - -import ( - "os" - "testing" - - "github.com/multiformats/go-multiaddr" -) - -func TestDefaultBootstrapPeersNonEmpty(t *testing.T) { - old := os.Getenv("DEBROS_BOOTSTRAP_PEERS") - t.Cleanup(func() { os.Setenv("DEBROS_BOOTSTRAP_PEERS", old) }) - _ = os.Setenv("DEBROS_BOOTSTRAP_PEERS", "") // ensure not set - peers := DefaultBootstrapPeers() - if len(peers) == 0 { - t.Fatalf("expected non-empty default bootstrap peers") - } -} - -func TestDefaultDatabaseEndpointsEnvOverride(t *testing.T) { - oldNodes := os.Getenv("RQLITE_NODES") - t.Cleanup(func() { os.Setenv("RQLITE_NODES", oldNodes) }) - _ = os.Setenv("RQLITE_NODES", "db1.local:7001, https://db2.local:7443") - endpoints := DefaultDatabaseEndpoints() - if len(endpoints) != 2 { - t.Fatalf("expected 2 endpoints from env, got %v", endpoints) - } -} - -func TestNormalizeEndpoints(t *testing.T) { - in := []string{"db.local", "http://db.local:5001", "[::1]", "https://host:8443"} - out := normalizeEndpoints(in) - if len(out) != 4 { - t.Fatalf("unexpected len: %v", out) - } - foundDefault := false - for _, s := range out { - if s == "http://db.local:5001" { - foundDefault = true - } - } - if !foundDefault { - t.Fatalf("missing normalized default port: %v", out) - } -} - -func TestEndpointFromMultiaddr(t *testing.T) { - ma, _ := multiaddr.NewMultiaddr("/ip4/127.0.0.1/tcp/4001") - if ep := endpointFromMultiaddr(ma, 5001); ep != "http://127.0.0.1:5001" { - t.Fatalf("unexpected endpoint: %s", ep) - } -} diff --git a/pkg/client/implementations.go b/pkg/client/implementations.go index ea06381..9dd61a8 100644 --- a/pkg/client/implementations.go +++ b/pkg/client/implementations.go @@ -6,6 +6,7 @@ import ( "strings" "sync" "time" + "unicode" "github.com/libp2p/go-libp2p/core/peer" "github.com/multiformats/go-multiaddr" @@ -14,9 +15,10 @@ import ( // DatabaseClientImpl implements DatabaseClient type DatabaseClientImpl struct { - client *Client - connection *gorqlite.Connection - mu sync.RWMutex + client *Client + connection *gorqlite.Connection + databaseName string // Empty for default database, or specific database name + mu sync.RWMutex } // checkConnection verifies the client is connected @@ -176,19 +178,17 @@ func (d *DatabaseClientImpl) 
getRQLiteConnection() (*gorqlite.Connection, error)
 
 // getRQLiteNodes returns a list of RQLite node URLs with precedence:
 // 1) client config DatabaseEndpoints
-// 2) RQLITE_NODES env (comma/space separated)
-// 3) library defaults via DefaultDatabaseEndpoints()
+// 2) otherwise none; dynamic clustering resolves endpoints at query time
 func (d *DatabaseClientImpl) getRQLiteNodes() []string {
 	// 1) Prefer explicit configuration on the client
 	if d.client != nil && d.client.config != nil && len(d.client.config.DatabaseEndpoints) > 0 {
-		return dedupeStrings(normalizeEndpoints(d.client.config.DatabaseEndpoints))
+		return d.client.config.DatabaseEndpoints
 	}
 
-	// 3) Fallback to library defaults derived from bootstrap peers
-	return DefaultDatabaseEndpoints()
+	// 2) Return empty - dynamic clustering will determine endpoints
+	return []string{}
 }
 
-// normalizeEndpoints is now imported from defaults.go
-
 func hasPort(hostport string) bool {
 	// cheap check for :port suffix (IPv6 with brackets handled by url.Parse earlier)
 	if i := strings.LastIndex(hostport, ":"); i > -1 && i < len(hostport)-1 {
@@ -392,6 +392,46 @@ func (d *DatabaseClientImpl) GetSchema(ctx context.Context) (*SchemaInfo, error)
 	return schema, nil
 }
 
+// Database returns a database client for the named database.
+// The database name is prefixed with the app name for isolation.
+func (d *DatabaseClientImpl) Database(name string) (DatabaseClient, error) {
+	if !d.client.isConnected() {
+		return nil, fmt.Errorf("client not connected")
+	}
+
+	// Sanitize and prefix database name
+	appName := d.client.getAppNamespace()
+	fullDBName := sanitizeDatabaseName(appName, name)
+
+	// Create a new database client instance for this specific database
+	dbClient := &DatabaseClientImpl{
+		client:       d.client,
+		databaseName: fullDBName,
+	}
+
+	return dbClient, nil
+}
+
+// sanitizeDatabaseName creates a sanitized database name with app prefix
+func sanitizeDatabaseName(appName, dbName string) string {
+	sanitizedApp := sanitizeIdentifier(appName)
+	sanitizedDB := sanitizeIdentifier(dbName)
+	return fmt.Sprintf("%s_%s", sanitizedApp, sanitizedDB)
+}
+
+// sanitizeIdentifier sanitizes an identifier (app or database name)
+func sanitizeIdentifier(name string) string {
+	var result strings.Builder
+	for _, r := range name {
+		if unicode.IsLetter(r) || unicode.IsNumber(r) || r == '_' {
+			result.WriteRune(unicode.ToLower(r))
+		} else if r == '-' || r == ' ' {
+			result.WriteRune('_')
+		}
+	}
+	return result.String()
+}
+
 // NetworkInfoImpl implements NetworkInfo
 type NetworkInfoImpl struct {
 	client *Client
diff --git a/pkg/client/interface.go b/pkg/client/interface.go
index 328a0cd..9a4b6f2 100644
--- a/pkg/client/interface.go
+++ b/pkg/client/interface.go
@@ -2,7 +2,6 @@ package client
 
 import (
 	"context"
-	"fmt"
 	"time"
 )
 
@@ -33,6 +32,11 @@ type DatabaseClient interface {
 	CreateTable(ctx context.Context, schema string) error
 	DropTable(ctx context.Context, tableName string) error
 	GetSchema(ctx context.Context) (*SchemaInfo, error)
+
+	// Multi-database support (NEW)
+	// Database returns a database client for the named database.
+	// The database name will be prefixed with the app name for isolation.
+	Database(name string) (DatabaseClient, error)
 }
 
 // PubSubClient provides publish/subscribe messaging
@@ -120,23 +124,3 @@ type ClientConfig struct {
 	APIKey string `json:"api_key"` // API key for gateway auth
 	JWT    string `json:"jwt"`     // Optional JWT bearer token
 }
-
-// DefaultClientConfig returns a default client configuration
-func DefaultClientConfig(appName string) *ClientConfig {
-	// Base defaults
-	peers := DefaultBootstrapPeers()
-	endpoints := DefaultDatabaseEndpoints()
-
-	return &ClientConfig{
-		AppName:           appName,
-		DatabaseName:      fmt.Sprintf("%s_db", appName),
-		BootstrapPeers:    peers,
-		DatabaseEndpoints: endpoints,
-		ConnectTimeout:    time.Second * 30,
-		RetryAttempts:     3,
-		RetryDelay:        time.Second * 5,
-		QuietMode:         false,
-		APIKey:            "",
-		JWT:               "",
-	}
-}
diff --git a/pkg/config/config.go b/pkg/config/config.go
index 39bef1b..692f9d0 100644
--- a/pkg/config/config.go
+++ b/pkg/config/config.go
@@ -26,26 +26,29 @@ type NodeConfig struct {
 
 // DatabaseConfig contains database-related configuration
 type DatabaseConfig struct {
-	DataDir           string        `yaml:"data_dir"`
 	ReplicationFactor int           `yaml:"replication_factor"`
 	ShardCount        int           `yaml:"shard_count"`
 	MaxDatabaseSize   int64         `yaml:"max_database_size"` // In bytes
 	BackupInterval    time.Duration `yaml:"backup_interval"`
 
-	// RQLite-specific configuration
-	RQLitePort        int    `yaml:"rqlite_port"`         // RQLite HTTP API port
-	RQLiteRaftPort    int    `yaml:"rqlite_raft_port"`    // RQLite Raft consensus port
-	RQLiteJoinAddress string `yaml:"rqlite_join_address"` // Address to join RQLite cluster
+	// Dynamic database clustering
+	HibernationTimeout time.Duration `yaml:"hibernation_timeout"`   // Idle duration before hibernation (e.g. "60s")
+	MaxDatabases       int           `yaml:"max_databases"`         // Max databases per node
+	PortRangeHTTPStart int           `yaml:"port_range_http_start"` // HTTP port range start
+	PortRangeHTTPEnd   int           `yaml:"port_range_http_end"`   // HTTP port range end
+	PortRangeRaftStart int           `yaml:"port_range_raft_start"` // Raft port range start
+	PortRangeRaftEnd   int           `yaml:"port_range_raft_end"`   // Raft port range end
 }
 
 // DiscoveryConfig contains peer discovery configuration
 type DiscoveryConfig struct {
-	BootstrapPeers    []string      `yaml:"bootstrap_peers"`    // Bootstrap peer addresses
-	DiscoveryInterval time.Duration `yaml:"discovery_interval"` // Discovery announcement interval
-	BootstrapPort     int           `yaml:"bootstrap_port"`     // Default port for bootstrap nodes
-	HttpAdvAddress    string        `yaml:"http_adv_address"`   // HTTP advertisement address
-	RaftAdvAddress    string        `yaml:"raft_adv_address"`   // Raft advertisement
-	NodeNamespace     string        `yaml:"node_namespace"`     // Namespace for node identifiers
+	BootstrapPeers      []string      `yaml:"bootstrap_peers"`       // Bootstrap peer addresses
+	DiscoveryInterval   time.Duration `yaml:"discovery_interval"`    // Discovery announcement interval
+	BootstrapPort       int           `yaml:"bootstrap_port"`        // Default port for bootstrap nodes
+	HttpAdvAddress      string        `yaml:"http_adv_address"`      // HTTP advertisement address
+	RaftAdvAddress      string        `yaml:"raft_adv_address"`      // Raft advertisement address
+	NodeNamespace       string        `yaml:"node_namespace"`        // Namespace for node identifiers
+	HealthCheckInterval time.Duration `yaml:"health_check_interval"` // Health check interval for node monitoring
 }
 
 // SecurityConfig contains security-related configuration
@@ -96,30 +99,34 @@ func DefaultConfig() *Config {
 			MaxConnections: 50,
 		},
 		Database: DatabaseConfig{
-			DataDir:           "./data/db",
 			ReplicationFactor: 3,
 			ShardCount:        16,
 			MaxDatabaseSize:   1024 * 1024 * 1024, // 1GB
 			BackupInterval:    time.Hour * 24,     // Daily backups
 
-			// RQLite-specific configuration
-			RQLitePort:        5001,
-			RQLiteRaftPort:    7001,
-			RQLiteJoinAddress: "", // Empty for bootstrap node
+			// Dynamic database clustering
+			HibernationTimeout: 60 * time.Second,
+			MaxDatabases:       100,
+			PortRangeHTTPStart: 5001,
+			PortRangeHTTPEnd:   5999,
+			PortRangeRaftStart: 7001,
+			PortRangeRaftEnd:   7999,
 		},
 		Discovery: DiscoveryConfig{
 			BootstrapPeers: []string{
-
"/ip4/217.76.54.168/tcp/4001/p2p/12D3KooWDp7xeShVY9uHfqNVPSsJeCKUatAviFZV8Y1joox5nUvx", - "/ip4/217.76.54.178/tcp/4001/p2p/12D3KooWKZnirPwNT4URtNSWK45f6vLkEs4xyUZ792F8Uj1oYnm1", - "/ip4/51.83.128.181/tcp/4001/p2p/12D3KooWBn2Zf1R8v9pEfmz7hDZ5b3oADxfejA3zJBYzKRCzgvhR", - "/ip4/155.133.27.199/tcp/4001/p2p/12D3KooWC69SBzM5QUgrLrfLWUykE8au32X5LwT7zwv9bixrQPm1", - "/ip4/217.76.56.2/tcp/4001/p2p/12D3KooWEiqJHvznxqJ5p2y8mUs6Ky6dfU1xTYFQbyKRCABfcZz4", + "/ip4/127.0.0.1/tcp/4001/p2p/12D3KooWKdj4B3LdZ8whYGaa97giwWCoSELciRp6qsFrDvz2Etah", + // "/ip4/217.76.54.168/tcp/4001/p2p/12D3KooWDp7xeShVY9uHfqNVPSsJeCKUatAviFZV8Y1joox5nUvx", + // "/ip4/217.76.54.178/tcp/4001/p2p/12D3KooWKZnirPwNT4URtNSWK45f6vLkEs4xyUZ792F8Uj1oYnm1", + // "/ip4/51.83.128.181/tcp/4001/p2p/12D3KooWBn2Zf1R8v9pEfmz7hDZ5b3oADxfejA3zJBYzKRCzgvhR", + // "/ip4/155.133.27.199/tcp/4001/p2p/12D3KooWC69SBzM5QUgrLrfLWUykE8au32X5LwT7zwv9bixrQPm1", + // "/ip4/217.76.56.2/tcp/4001/p2p/12D3KooWEiqJHvznxqJ5p2y8mUs6Ky6dfU1xTYFQbyKRCABfcZz4", }, - BootstrapPort: 4001, // Default LibP2P port - DiscoveryInterval: time.Second * 15, // Back to 15 seconds for testing - HttpAdvAddress: "", - RaftAdvAddress: "", - NodeNamespace: "default", + BootstrapPort: 4001, // Default LibP2P port + DiscoveryInterval: time.Second * 15, // Back to 15 seconds for testing + HttpAdvAddress: "", + RaftAdvAddress: "", + NodeNamespace: "default", + HealthCheckInterval: 10 * time.Second, // Health check interval }, Security: SecurityConfig{ EnableTLS: false, diff --git a/pkg/gateway/database_handlers.go b/pkg/gateway/database_handlers.go new file mode 100644 index 0000000..ee7b2a1 --- /dev/null +++ b/pkg/gateway/database_handlers.go @@ -0,0 +1,449 @@ +package gateway + +import ( + "context" + "encoding/json" + "fmt" + "net/http" + "strings" + "time" + + "github.com/DeBrosOfficial/network/pkg/logging" + "go.uber.org/zap" +) + +// Database request/response types + +type ExecRequest struct { + Database string `json:"database"` + SQL string `json:"sql"` + Args []interface{} `json:"args,omitempty"` +} + +type ExecResponse struct { + RowsAffected int64 `json:"rows_affected"` + LastInsertID int64 `json:"last_insert_id,omitempty"` + Error string `json:"error,omitempty"` +} + +type QueryRequest struct { + Database string `json:"database"` + SQL string `json:"sql"` + Args []interface{} `json:"args,omitempty"` +} + +type QueryResponse struct { + Items []map[string]interface{} `json:"items"` + Count int `json:"count"` + Error string `json:"error,omitempty"` +} + +type TransactionRequest struct { + Database string `json:"database"` + Queries []string `json:"queries"` +} + +type TransactionResponse struct { + Success bool `json:"success"` + Error string `json:"error,omitempty"` +} + +type CreateTableRequest struct { + Database string `json:"database"` + Schema string `json:"schema"` +} + +type DropTableRequest struct { + Database string `json:"database"` + TableName string `json:"table_name"` +} + +type SchemaResponse struct { + Tables []TableSchema `json:"tables"` + Error string `json:"error,omitempty"` +} + +type TableSchema struct { + Name string `json:"name"` + CreateSQL string `json:"create_sql"` + Columns []string `json:"columns,omitempty"` +} + +// Database handlers + +// databaseExecHandler handles SQL execution (INSERT, UPDATE, DELETE, DDL) +func (g *Gateway) databaseExecHandler(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.Error(w, "Method not allowed", http.StatusMethodNotAllowed) + return + } + + var req ExecRequest + if err := 
json.NewDecoder(r.Body).Decode(&req); err != nil {
+		g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "Invalid request body"})
+		return
+	}
+
+	if req.Database == "" {
+		g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "database field is required"})
+		return
+	}
+
+	if req.SQL == "" {
+		g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "sql field is required"})
+		return
+	}
+
+	// Get database client
+	db, err := g.client.Database().Database(req.Database)
+	if err != nil {
+		g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err))
+		g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: fmt.Sprintf("Failed to access database: %v", err)})
+		return
+	}
+
+	// Execute with timeout
+	ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
+	defer cancel()
+
+	// For simplicity every statement goes through Query for now; a production
+	// build should detect writes and route them through a dedicated exec path.
+	result, err := db.Query(ctx, req.SQL, req.Args...)
+	if err != nil {
+		g.logger.ComponentError(logging.ComponentDatabase, "Query execution failed",
+			zap.String("database", req.Database),
+			zap.Error(err))
+		g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: err.Error()})
+		return
+	}
+
+	// NOTE: result.Count is the row count reported by Query, not a true
+	// rows-affected figure; treat it as best-effort until an exec path exists.
+	g.respondJSON(w, http.StatusOK, ExecResponse{
+		RowsAffected: result.Count,
+	})
+}
+
+// databaseQueryHandler handles SELECT queries
+func (g *Gateway) databaseQueryHandler(w http.ResponseWriter, r *http.Request) {
+	if r.Method != http.MethodPost {
+		http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
+		return
+	}
+
+	var req QueryRequest
+	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+		g.respondJSON(w, http.StatusBadRequest, QueryResponse{Error: "Invalid request body"})
+		return
+	}
+
+	if req.Database == "" {
+		g.respondJSON(w, http.StatusBadRequest, QueryResponse{Error: "database field is required"})
+		return
+	}
+
+	if req.SQL == "" {
+		g.respondJSON(w, http.StatusBadRequest, QueryResponse{Error: "sql field is required"})
+		return
+	}
+
+	// Get database client
+	db, err := g.client.Database().Database(req.Database)
+	if err != nil {
+		g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err))
+		g.respondJSON(w, http.StatusInternalServerError, QueryResponse{Error: fmt.Sprintf("Failed to access database: %v", err)})
+		return
+	}
+
+	// Execute with timeout
+	ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
+	defer cancel()
+
+	result, err := db.Query(ctx, req.SQL, req.Args...)
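+	// Args are assumed to be passed through as bound parameters by the
+	// underlying client (never interpolated into the SQL string), so only
+	// identifiers such as table names need manual validation in this file.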
+ if err != nil { + g.logger.ComponentError(logging.ComponentDatabase, "Query execution failed", + zap.String("database", req.Database), + zap.Error(err)) + g.respondJSON(w, http.StatusInternalServerError, QueryResponse{Error: err.Error()}) + return + } + + // Convert result to map format + items := make([]map[string]interface{}, len(result.Rows)) + for i, row := range result.Rows { + item := make(map[string]interface{}) + for j, col := range result.Columns { + if j < len(row) { + item[col] = row[j] + } + } + items[i] = item + } + + g.respondJSON(w, http.StatusOK, QueryResponse{ + Items: items, + Count: len(items), + }) +} + +// databaseTransactionHandler handles atomic transactions +func (g *Gateway) databaseTransactionHandler(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.Error(w, "Method not allowed", http.StatusMethodNotAllowed) + return + } + + var req TransactionRequest + if err := json.NewDecoder(r.Body).Decode(&req); err != nil { + g.respondJSON(w, http.StatusBadRequest, TransactionResponse{Success: false, Error: "Invalid request body"}) + return + } + + if req.Database == "" { + g.respondJSON(w, http.StatusBadRequest, TransactionResponse{Success: false, Error: "database field is required"}) + return + } + + if len(req.Queries) == 0 { + g.respondJSON(w, http.StatusBadRequest, TransactionResponse{Success: false, Error: "queries field is required and must not be empty"}) + return + } + + // Get database client + db, err := g.client.Database().Database(req.Database) + if err != nil { + g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err)) + g.respondJSON(w, http.StatusInternalServerError, TransactionResponse{Success: false, Error: fmt.Sprintf("Failed to access database: %v", err)}) + return + } + + // Execute with timeout + ctx, cancel := context.WithTimeout(r.Context(), 60*time.Second) + defer cancel() + + err = db.Transaction(ctx, req.Queries) + if err != nil { + g.logger.ComponentError(logging.ComponentDatabase, "Transaction failed", + zap.String("database", req.Database), + zap.Error(err)) + g.respondJSON(w, http.StatusInternalServerError, TransactionResponse{Success: false, Error: err.Error()}) + return + } + + g.respondJSON(w, http.StatusOK, TransactionResponse{Success: true}) +} + +// databaseSchemaHandler returns database schema information +func (g *Gateway) databaseSchemaHandler(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodGet && r.Method != http.MethodPost { + http.Error(w, "Method not allowed", http.StatusMethodNotAllowed) + return + } + + // Support both GET with query param and POST with JSON body + var database string + if r.Method == http.MethodPost { + var req struct { + Database string `json:"database"` + } + if err := json.NewDecoder(r.Body).Decode(&req); err != nil { + g.respondJSON(w, http.StatusBadRequest, SchemaResponse{Error: "Invalid request body"}) + return + } + database = req.Database + } else { + database = r.URL.Query().Get("database") + } + + if database == "" { + g.respondJSON(w, http.StatusBadRequest, SchemaResponse{Error: "database parameter is required"}) + return + } + + // Get database client + db, err := g.client.Database().Database(database) + if err != nil { + g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err)) + g.respondJSON(w, http.StatusInternalServerError, SchemaResponse{Error: fmt.Sprintf("Failed to access database: %v", err)}) + return + } + + // Execute with timeout + ctx, cancel := 
context.WithTimeout(r.Context(), 30*time.Second) + defer cancel() + + schemaInfo, err := db.GetSchema(ctx) + if err != nil { + g.logger.ComponentError(logging.ComponentDatabase, "Failed to get schema", + zap.String("database", database), + zap.Error(err)) + g.respondJSON(w, http.StatusInternalServerError, SchemaResponse{Error: err.Error()}) + return + } + + // Convert to response format + tables := make([]TableSchema, len(schemaInfo.Tables)) + for i, table := range schemaInfo.Tables { + columns := make([]string, len(table.Columns)) + for j, col := range table.Columns { + columns[j] = col.Name + } + tables[i] = TableSchema{ + Name: table.Name, + Columns: columns, + } + } + + g.respondJSON(w, http.StatusOK, SchemaResponse{Tables: tables}) +} + +// databaseCreateTableHandler creates a new table +func (g *Gateway) databaseCreateTableHandler(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.Error(w, "Method not allowed", http.StatusMethodNotAllowed) + return + } + + var req CreateTableRequest + if err := json.NewDecoder(r.Body).Decode(&req); err != nil { + g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "Invalid request body"}) + return + } + + if req.Database == "" { + g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "database field is required"}) + return + } + + if req.Schema == "" { + g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "schema field is required"}) + return + } + + // Get database client + db, err := g.client.Database().Database(req.Database) + if err != nil { + g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err)) + g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: fmt.Sprintf("Failed to access database: %v", err)}) + return + } + + // Execute with timeout + ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second) + defer cancel() + + err = db.CreateTable(ctx, req.Schema) + if err != nil { + g.logger.ComponentError(logging.ComponentDatabase, "Failed to create table", + zap.String("database", req.Database), + zap.Error(err)) + g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: err.Error()}) + return + } + + g.respondJSON(w, http.StatusOK, ExecResponse{RowsAffected: 0}) +} + +// databaseDropTableHandler drops a table +func (g *Gateway) databaseDropTableHandler(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.Error(w, "Method not allowed", http.StatusMethodNotAllowed) + return + } + + var req DropTableRequest + if err := json.NewDecoder(r.Body).Decode(&req); err != nil { + g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "Invalid request body"}) + return + } + + if req.Database == "" { + g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "database field is required"}) + return + } + + if req.TableName == "" { + g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "table_name field is required"}) + return + } + + // Validate table name (basic SQL injection prevention) + if !isValidIdentifier(req.TableName) { + g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "invalid table name"}) + return + } + + // Get database client + db, err := g.client.Database().Database(req.Database) + if err != nil { + g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err)) + g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: fmt.Sprintf("Failed to access database: %v", err)}) + return + } + + // Execute with timeout 
+ ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second) + defer cancel() + + err = db.DropTable(ctx, req.TableName) + if err != nil { + g.logger.ComponentError(logging.ComponentDatabase, "Failed to drop table", + zap.String("database", req.Database), + zap.String("table", req.TableName), + zap.Error(err)) + g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: err.Error()}) + return + } + + g.respondJSON(w, http.StatusOK, ExecResponse{RowsAffected: 0}) +} + +// databaseListHandler lists all available databases for the current app +func (g *Gateway) databaseListHandler(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodGet { + http.Error(w, "Method not allowed", http.StatusMethodNotAllowed) + return + } + + // TODO: This would require the ClusterManager to expose a list of databases + // For now, return a placeholder + g.respondJSON(w, http.StatusOK, map[string]interface{}{ + "databases": []string{}, + "message": "Database listing not yet implemented - query metadata store directly", + }) +} + +// Helper functions + +func (g *Gateway) respondJSON(w http.ResponseWriter, status int, data interface{}) { + w.Header().Set("Content-Type", "application/json") + w.WriteHeader(status) + if err := json.NewEncoder(w).Encode(data); err != nil { + g.logger.ComponentError(logging.ComponentGeneral, "Failed to encode JSON response", zap.Error(err)) + } +} + +func isValidIdentifier(name string) bool { + if len(name) == 0 || len(name) > 128 { + return false + } + // Only allow alphanumeric, underscore, and hyphen + for _, r := range name { + if !(r >= 'a' && r <= 'z') && !(r >= 'A' && r <= 'Z') && !(r >= '0' && r <= '9') && r != '_' && r != '-' { + return false + } + } + // Don't start with number + firstRune := []rune(name)[0] + if firstRune >= '0' && firstRune <= '9' { + return false + } + // Avoid SQL keywords + upperName := strings.ToUpper(name) + sqlKeywords := []string{"SELECT", "INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER", "TABLE", "DATABASE", "INDEX"} + for _, keyword := range sqlKeywords { + if upperName == keyword { + return false + } + } + return true +} diff --git a/pkg/gateway/gateway.go b/pkg/gateway/gateway.go index 4e140ed..42945d3 100644 --- a/pkg/gateway/gateway.go +++ b/pkg/gateway/gateway.go @@ -4,16 +4,12 @@ import ( "context" "crypto/rand" "crypto/rsa" - "database/sql" "strconv" "time" "github.com/DeBrosOfficial/network/pkg/client" "github.com/DeBrosOfficial/network/pkg/logging" - "github.com/DeBrosOfficial/network/pkg/rqlite" "go.uber.org/zap" - - _ "github.com/rqlite/gorqlite/stdlib" ) // Config holds configuration for the gateway server @@ -34,11 +30,6 @@ type Gateway struct { startedAt time.Time signingKey *rsa.PrivateKey keyID string - - // rqlite SQL connection and HTTP ORM gateway - sqlDB *sql.DB - ormClient rqlite.Client - ormHTTP *rqlite.HTTPGateway } // New creates and initializes a new Gateway instance @@ -87,24 +78,7 @@ func New(logger *logging.ColoredLogger, cfg *Config) (*Gateway, error) { logger.ComponentWarn(logging.ComponentGeneral, "failed to generate RSA key; jwks will be empty", zap.Error(err)) } - logger.ComponentInfo(logging.ComponentGeneral, "Initializing RQLite ORM HTTP gateway...") - dsn := cfg.RQLiteDSN - if dsn == "" { - dsn = "http://localhost:4001" - } - db, dbErr := sql.Open("rqlite", dsn) - if dbErr != nil { - logger.ComponentWarn(logging.ComponentGeneral, "failed to open rqlite sql db; http orm gateway disabled", zap.Error(dbErr)) - } else { - gw.sqlDB = db - orm := rqlite.NewClient(db) - gw.ormClient 
= orm - gw.ormHTTP = rqlite.NewHTTPGateway(orm, "/v1/db") - logger.ComponentInfo(logging.ComponentGeneral, "RQLite ORM HTTP gateway ready", - zap.String("dsn", dsn), - zap.String("base_path", "/v1/db"), - ) - } + logger.ComponentInfo(logging.ComponentGeneral, "Gateway initialized with dynamic database clustering") logger.ComponentInfo(logging.ComponentGeneral, "Gateway creation completed, returning...") return gw, nil @@ -122,7 +96,5 @@ func (g *Gateway) Close() { g.logger.ComponentWarn(logging.ComponentClient, "error during client disconnect", zap.Error(err)) } } - if g.sqlDB != nil { - _ = g.sqlDB.Close() - } + // No legacy database connections to close } diff --git a/pkg/gateway/routes.go b/pkg/gateway/routes.go index 4ad7cc9..89f1b57 100644 --- a/pkg/gateway/routes.go +++ b/pkg/gateway/routes.go @@ -27,12 +27,6 @@ func (g *Gateway) Routes() http.Handler { mux.HandleFunc("/v1/auth/logout", g.logoutHandler) mux.HandleFunc("/v1/auth/whoami", g.whoamiHandler) - // rqlite ORM HTTP gateway (mounts /v1/rqlite/* endpoints) - if g.ormHTTP != nil { - g.ormHTTP.BasePath = "/v1/rqlite" - g.ormHTTP.RegisterRoutes(mux) - } - // network mux.HandleFunc("/v1/network/status", g.networkStatusHandler) mux.HandleFunc("/v1/network/peers", g.networkPeersHandler) @@ -44,5 +38,14 @@ func (g *Gateway) Routes() http.Handler { mux.HandleFunc("/v1/pubsub/publish", g.pubsubPublishHandler) mux.HandleFunc("/v1/pubsub/topics", g.pubsubTopicsHandler) + // database operations (dynamic clustering) + mux.HandleFunc("/v1/database/exec", g.databaseExecHandler) + mux.HandleFunc("/v1/database/query", g.databaseQueryHandler) + mux.HandleFunc("/v1/database/transaction", g.databaseTransactionHandler) + mux.HandleFunc("/v1/database/schema", g.databaseSchemaHandler) + mux.HandleFunc("/v1/database/create-table", g.databaseCreateTableHandler) + mux.HandleFunc("/v1/database/drop-table", g.databaseDropTableHandler) + mux.HandleFunc("/v1/database/list", g.databaseListHandler) + return g.withMiddleware(mux) } diff --git a/pkg/node/node.go b/pkg/node/node.go index fb1af01..3338e24 100644 --- a/pkg/node/node.go +++ b/pkg/node/node.go @@ -34,8 +34,8 @@ type Node struct { logger *logging.ColoredLogger host host.Host - rqliteManager *database.RQLiteManager - rqliteAdapter *database.RQLiteAdapter + // Dynamic database clustering + clusterManager *database.ClusterManager // Peer discovery discoveryCancel context.CancelFunc @@ -59,25 +59,26 @@ func NewNode(cfg *config.Config) (*Node, error) { }, nil } -// startRQLite initializes and starts the RQLite database -func (n *Node) startRQLite(ctx context.Context) error { - n.logger.Info("Starting RQLite database") +// startClusterManager initializes and starts the cluster manager for dynamic databases +func (n *Node) startClusterManager(ctx context.Context) error { + n.logger.Info("Starting dynamic database cluster manager") - // Create RQLite manager - n.rqliteManager = database.NewRQLiteManager(&n.config.Database, &n.config.Discovery, n.config.Node.DataDir, n.logger.Logger) + // Create cluster manager + n.clusterManager = database.NewClusterManager( + n.host.ID().String(), + &n.config.Database, + &n.config.Discovery, + n.config.Node.DataDir, + n.pubsub, + n.logger.Logger, + ) - // Start RQLite - if err := n.rqliteManager.Start(ctx); err != nil { - return err + // Start cluster manager + if err := n.clusterManager.Start(); err != nil { + return fmt.Errorf("failed to start cluster manager: %w", err) } - // Create adapter for sql.DB compatibility - adapter, err := 
database.NewRQLiteAdapter(n.rqliteManager) - if err != nil { - return fmt.Errorf("failed to create RQLite adapter: %w", err) - } - n.rqliteAdapter = adapter - + n.logger.Info("Dynamic database cluster manager started successfully") return nil } @@ -563,19 +564,18 @@ func (n *Node) Stop() error { // Stop peer discovery n.stopPeerDiscovery() + // Stop cluster manager + if n.clusterManager != nil { + if err := n.clusterManager.Stop(); err != nil { + n.logger.ComponentWarn(logging.ComponentNode, "Error stopping cluster manager", zap.Error(err)) + } + } + // Stop LibP2P host if n.host != nil { n.host.Close() } - // Stop RQLite - if n.rqliteAdapter != nil { - n.rqliteAdapter.Close() - } - if n.rqliteManager != nil { - _ = n.rqliteManager.Stop() - } - n.logger.ComponentInfo(logging.ComponentNode, "Network node stopped") return nil } @@ -589,16 +589,16 @@ func (n *Node) Start(ctx context.Context) error { return fmt.Errorf("failed to create data directory: %w", err) } - // Start RQLite - if err := n.startRQLite(ctx); err != nil { - return fmt.Errorf("failed to start RQLite: %w", err) - } - - // Start LibP2P host + // Start LibP2P host (required before cluster manager) if err := n.startLibP2P(); err != nil { return fmt.Errorf("failed to start LibP2P: %w", err) } + // Start cluster manager for dynamic databases + if err := n.startClusterManager(ctx); err != nil { + return fmt.Errorf("failed to start cluster manager: %w", err) + } + // Get listen addresses for logging var listenAddrs []string for _, addr := range n.host.Addrs() { diff --git a/pkg/rqlite/adapter.go b/pkg/rqlite/adapter.go deleted file mode 100644 index 81205bf..0000000 --- a/pkg/rqlite/adapter.go +++ /dev/null @@ -1,46 +0,0 @@ -package rqlite - -import ( - "database/sql" - "fmt" - - _ "github.com/rqlite/gorqlite/stdlib" // Import the database/sql driver -) - -// RQLiteAdapter adapts RQLite to the sql.DB interface -type RQLiteAdapter struct { - manager *RQLiteManager - db *sql.DB -} - -// NewRQLiteAdapter creates a new adapter that provides sql.DB interface for RQLite -func NewRQLiteAdapter(manager *RQLiteManager) (*RQLiteAdapter, error) { - // Use the gorqlite database/sql driver - db, err := sql.Open("rqlite", fmt.Sprintf("http://localhost:%d", manager.config.RQLitePort)) - if err != nil { - return nil, fmt.Errorf("failed to open RQLite SQL connection: %w", err) - } - - return &RQLiteAdapter{ - manager: manager, - db: db, - }, nil -} - -// GetSQLDB returns the sql.DB interface for compatibility with existing storage service -func (a *RQLiteAdapter) GetSQLDB() *sql.DB { - return a.db -} - -// GetManager returns the underlying RQLite manager for advanced operations -func (a *RQLiteAdapter) GetManager() *RQLiteManager { - return a.manager -} - -// Close closes the adapter connections -func (a *RQLiteAdapter) Close() error { - if a.db != nil { - a.db.Close() - } - return a.manager.Stop() -} diff --git a/pkg/rqlite/client.go b/pkg/rqlite/client.go deleted file mode 100644 index 70c78e2..0000000 --- a/pkg/rqlite/client.go +++ /dev/null @@ -1,835 +0,0 @@ -package rqlite - -// client.go defines the ORM-like interfaces and a minimal implementation over database/sql. -// It builds on the rqlite stdlib driver so it behaves like a regular SQL-backed ORM. - -import ( - "context" - "database/sql" - "errors" - "fmt" - "reflect" - "strings" - "time" -) - -// TableNamer lets a struct provide its table name. -type TableNamer interface { - TableName() string -} - -// Client is the high-level ORM-like API. 
-type Client interface { - // Query runs an arbitrary SELECT and scans rows into dest (pointer to slice of structs or []map[string]any). - Query(ctx context.Context, dest any, query string, args ...any) error - // Exec runs a write statement (INSERT/UPDATE/DELETE). - Exec(ctx context.Context, query string, args ...any) (sql.Result, error) - - // FindBy/FindOneBy provide simple map-based criteria filtering. - FindBy(ctx context.Context, dest any, table string, criteria map[string]any, opts ...FindOption) error - FindOneBy(ctx context.Context, dest any, table string, criteria map[string]any, opts ...FindOption) error - - // Save inserts or updates an entity (single-PK). - Save(ctx context.Context, entity any) error - // Remove deletes by PK (single-PK). - Remove(ctx context.Context, entity any) error - - // Repositories (generic layer). Optional but convenient if you use Go generics. - Repository(table string) any - - // Fluent query builder for advanced querying. - CreateQueryBuilder(table string) *QueryBuilder - - // Tx executes a function within a transaction. - Tx(ctx context.Context, fn func(tx Tx) error) error -} - -// Tx mirrors Client but executes within a transaction. -type Tx interface { - Query(ctx context.Context, dest any, query string, args ...any) error - Exec(ctx context.Context, query string, args ...any) (sql.Result, error) - CreateQueryBuilder(table string) *QueryBuilder - - // Optional: scoped Save/Remove inside tx - Save(ctx context.Context, entity any) error - Remove(ctx context.Context, entity any) error -} - -// Repository provides typed entity operations for a table. -type Repository[T any] interface { - Find(ctx context.Context, dest *[]T, criteria map[string]any, opts ...FindOption) error - FindOne(ctx context.Context, dest *T, criteria map[string]any, opts ...FindOption) error - Save(ctx context.Context, entity *T) error - Remove(ctx context.Context, entity *T) error - - // Builder helpers - Q() *QueryBuilder -} - -// NewClient wires the ORM client to a *sql.DB (from your RQLiteAdapter). -func NewClient(db *sql.DB) Client { - return &client{db: db} -} - -// NewClientFromAdapter is convenient if you already created the adapter. -func NewClientFromAdapter(adapter *RQLiteAdapter) Client { - return NewClient(adapter.GetSQLDB()) -} - -// client implements Client over *sql.DB. -type client struct { - db *sql.DB -} - -func (c *client) Query(ctx context.Context, dest any, query string, args ...any) error { - rows, err := c.db.QueryContext(ctx, query, args...) - if err != nil { - return err - } - defer rows.Close() - return scanIntoDest(rows, dest) -} - -func (c *client) Exec(ctx context.Context, query string, args ...any) (sql.Result, error) { - return c.db.ExecContext(ctx, query, args...) 
-} - -func (c *client) FindBy(ctx context.Context, dest any, table string, criteria map[string]any, opts ...FindOption) error { - qb := c.CreateQueryBuilder(table) - for k, v := range criteria { - qb = qb.AndWhere(fmt.Sprintf("%s = ?", k), v) - } - for _, opt := range opts { - opt(qb) - } - return qb.GetMany(ctx, dest) -} - -func (c *client) FindOneBy(ctx context.Context, dest any, table string, criteria map[string]any, opts ...FindOption) error { - qb := c.CreateQueryBuilder(table) - for k, v := range criteria { - qb = qb.AndWhere(fmt.Sprintf("%s = ?", k), v) - } - for _, opt := range opts { - opt(qb) - } - return qb.GetOne(ctx, dest) -} - -func (c *client) Save(ctx context.Context, entity any) error { - return saveEntity(ctx, c.db, entity) -} - -func (c *client) Remove(ctx context.Context, entity any) error { - return removeEntity(ctx, c.db, entity) -} - -func (c *client) Repository(table string) any { - // This returns an untyped interface since Go methods cannot have type parameters - // Users will need to type assert the result to Repository[T] - return func() any { - return &repository[any]{c: c, table: table} - }() -} - -func (c *client) CreateQueryBuilder(table string) *QueryBuilder { - return newQueryBuilder(c.db, table) -} - -func (c *client) Tx(ctx context.Context, fn func(tx Tx) error) error { - sqlTx, err := c.db.BeginTx(ctx, nil) - if err != nil { - return err - } - txc := &txClient{tx: sqlTx} - if err := fn(txc); err != nil { - _ = sqlTx.Rollback() - return err - } - return sqlTx.Commit() -} - -// txClient implements Tx over *sql.Tx. -type txClient struct { - tx *sql.Tx -} - -func (t *txClient) Query(ctx context.Context, dest any, query string, args ...any) error { - rows, err := t.tx.QueryContext(ctx, query, args...) - if err != nil { - return err - } - defer rows.Close() - return scanIntoDest(rows, dest) -} - -func (t *txClient) Exec(ctx context.Context, query string, args ...any) (sql.Result, error) { - return t.tx.ExecContext(ctx, query, args...) -} - -func (t *txClient) CreateQueryBuilder(table string) *QueryBuilder { - return newQueryBuilder(t.tx, table) -} - -func (t *txClient) Save(ctx context.Context, entity any) error { - return saveEntity(ctx, t.tx, entity) -} - -func (t *txClient) Remove(ctx context.Context, entity any) error { - return removeEntity(ctx, t.tx, entity) -} - -// executor is implemented by *sql.DB and *sql.Tx. -type executor interface { - QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error) - ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error) -} - -// QueryBuilder implements a fluent SELECT builder with joins, where, etc. -type QueryBuilder struct { - exec executor - table string - alias string - selects []string - - joins []joinClause - wheres []whereClause - - groupBys []string - orderBys []string - limit *int - offset *int -} - -// joinClause represents INNER/LEFT/etc joins. -type joinClause struct { - kind string // "INNER", "LEFT", "JOIN" (default) - table string - on string -} - -// whereClause holds an expression and args with a conjunction. -type whereClause struct { - conj string // "AND" or "OR" - expr string - args []any -} - -func newQueryBuilder(exec executor, table string) *QueryBuilder { - return &QueryBuilder{ - exec: exec, - table: table, - } -} - -func (qb *QueryBuilder) Select(cols ...string) *QueryBuilder { - qb.selects = append(qb.selects, cols...) 
- return qb -} - -func (qb *QueryBuilder) Alias(a string) *QueryBuilder { - qb.alias = a - return qb -} - -func (qb *QueryBuilder) Where(expr string, args ...any) *QueryBuilder { - return qb.AndWhere(expr, args...) -} - -func (qb *QueryBuilder) AndWhere(expr string, args ...any) *QueryBuilder { - qb.wheres = append(qb.wheres, whereClause{conj: "AND", expr: expr, args: args}) - return qb -} - -func (qb *QueryBuilder) OrWhere(expr string, args ...any) *QueryBuilder { - qb.wheres = append(qb.wheres, whereClause{conj: "OR", expr: expr, args: args}) - return qb -} - -func (qb *QueryBuilder) InnerJoin(table string, on string) *QueryBuilder { - qb.joins = append(qb.joins, joinClause{kind: "INNER", table: table, on: on}) - return qb -} - -func (qb *QueryBuilder) LeftJoin(table string, on string) *QueryBuilder { - qb.joins = append(qb.joins, joinClause{kind: "LEFT", table: table, on: on}) - return qb -} - -func (qb *QueryBuilder) Join(table string, on string) *QueryBuilder { - qb.joins = append(qb.joins, joinClause{kind: "JOIN", table: table, on: on}) - return qb -} - -func (qb *QueryBuilder) GroupBy(cols ...string) *QueryBuilder { - qb.groupBys = append(qb.groupBys, cols...) - return qb -} - -func (qb *QueryBuilder) OrderBy(exprs ...string) *QueryBuilder { - qb.orderBys = append(qb.orderBys, exprs...) - return qb -} - -func (qb *QueryBuilder) Limit(n int) *QueryBuilder { - qb.limit = &n - return qb -} - -func (qb *QueryBuilder) Offset(n int) *QueryBuilder { - qb.offset = &n - return qb -} - -// Build returns the SQL string and args for a SELECT. -func (qb *QueryBuilder) Build() (string, []any) { - cols := "*" - if len(qb.selects) > 0 { - cols = strings.Join(qb.selects, ", ") - } - base := fmt.Sprintf("SELECT %s FROM %s", cols, qb.table) - if qb.alias != "" { - base += " AS " + qb.alias - } - - args := make([]any, 0, 16) - for _, j := range qb.joins { - base += fmt.Sprintf(" %s JOIN %s ON %s", j.kind, j.table, j.on) - } - - if len(qb.wheres) > 0 { - base += " WHERE " - for i, w := range qb.wheres { - if i > 0 { - base += " " + w.conj + " " - } - base += "(" + w.expr + ")" - args = append(args, w.args...) - } - } - - if len(qb.groupBys) > 0 { - base += " GROUP BY " + strings.Join(qb.groupBys, ", ") - } - if len(qb.orderBys) > 0 { - base += " ORDER BY " + strings.Join(qb.orderBys, ", ") - } - if qb.limit != nil { - base += fmt.Sprintf(" LIMIT %d", *qb.limit) - } - if qb.offset != nil { - base += fmt.Sprintf(" OFFSET %d", *qb.offset) - } - return base, args -} - -// GetMany executes the built query and scans into dest (pointer to slice). -func (qb *QueryBuilder) GetMany(ctx context.Context, dest any) error { - sqlStr, args := qb.Build() - rows, err := qb.exec.QueryContext(ctx, sqlStr, args...) - if err != nil { - return err - } - defer rows.Close() - return scanIntoDest(rows, dest) -} - -// GetOne executes the built query and scans into dest (pointer to struct or map) with LIMIT 1. -func (qb *QueryBuilder) GetOne(ctx context.Context, dest any) error { - limit := 1 - if qb.limit == nil { - qb.limit = &limit - } else if qb.limit != nil && *qb.limit > 1 { - qb.limit = &limit - } - sqlStr, args := qb.Build() - rows, err := qb.exec.QueryContext(ctx, sqlStr, args...) - if err != nil { - return err - } - defer rows.Close() - if !rows.Next() { - return sql.ErrNoRows - } - return scanIntoSingle(rows, dest) -} - -// FindOption customizes Find queries. -type FindOption func(q *QueryBuilder) - -func WithOrderBy(exprs ...string) FindOption { - return func(q *QueryBuilder) { q.OrderBy(exprs...) 
} -} -func WithGroupBy(cols ...string) FindOption { - return func(q *QueryBuilder) { q.GroupBy(cols...) } -} -func WithLimit(n int) FindOption { - return func(q *QueryBuilder) { q.Limit(n) } -} -func WithOffset(n int) FindOption { - return func(q *QueryBuilder) { q.Offset(n) } -} -func WithSelect(cols ...string) FindOption { - return func(q *QueryBuilder) { q.Select(cols...) } -} -func WithJoin(kind, table, on string) FindOption { - return func(q *QueryBuilder) { - switch strings.ToUpper(kind) { - case "INNER": - q.InnerJoin(table, on) - case "LEFT": - q.LeftJoin(table, on) - default: - q.Join(table, on) - } - } -} - -// repository is a generic table repository for type T. -type repository[T any] struct { - c *client - table string -} - -func (r *repository[T]) Find(ctx context.Context, dest *[]T, criteria map[string]any, opts ...FindOption) error { - qb := r.c.CreateQueryBuilder(r.table) - for k, v := range criteria { - qb.AndWhere(fmt.Sprintf("%s = ?", k), v) - } - for _, opt := range opts { - opt(qb) - } - return qb.GetMany(ctx, dest) -} - -func (r *repository[T]) FindOne(ctx context.Context, dest *T, criteria map[string]any, opts ...FindOption) error { - qb := r.c.CreateQueryBuilder(r.table) - for k, v := range criteria { - qb.AndWhere(fmt.Sprintf("%s = ?", k), v) - } - for _, opt := range opts { - opt(qb) - } - return qb.GetOne(ctx, dest) -} - -func (r *repository[T]) Save(ctx context.Context, entity *T) error { - return saveEntity(ctx, r.c.db, entity) -} - -func (r *repository[T]) Remove(ctx context.Context, entity *T) error { - return removeEntity(ctx, r.c.db, entity) -} - -func (r *repository[T]) Q() *QueryBuilder { - return r.c.CreateQueryBuilder(r.table) -} - -// ----------------------- -// Reflection + scanning -// ----------------------- - -func scanIntoDest(rows *sql.Rows, dest any) error { - // dest must be pointer to slice (of struct or map) - rv := reflect.ValueOf(dest) - if rv.Kind() != reflect.Pointer || rv.IsNil() { - return errors.New("dest must be a non-nil pointer") - } - sliceVal := rv.Elem() - if sliceVal.Kind() != reflect.Slice { - return errors.New("dest must be pointer to a slice") - } - elemType := sliceVal.Type().Elem() - - cols, err := rows.Columns() - if err != nil { - return err - } - - for rows.Next() { - itemPtr := reflect.New(elemType) - // Support map[string]any and struct - if elemType.Kind() == reflect.Map { - m, err := scanRowToMap(rows, cols) - if err != nil { - return err - } - sliceVal.Set(reflect.Append(sliceVal, reflect.ValueOf(m))) - continue - } - - if elemType.Kind() == reflect.Struct { - if err := scanCurrentRowIntoStruct(rows, cols, itemPtr.Elem()); err != nil { - return err - } - sliceVal.Set(reflect.Append(sliceVal, itemPtr.Elem())) - continue - } - - return fmt.Errorf("unsupported slice element type: %s", elemType.Kind()) - } - return rows.Err() -} - -func scanIntoSingle(rows *sql.Rows, dest any) error { - rv := reflect.ValueOf(dest) - if rv.Kind() != reflect.Pointer || rv.IsNil() { - return errors.New("dest must be a non-nil pointer") - } - cols, err := rows.Columns() - if err != nil { - return err - } - - switch rv.Elem().Kind() { - case reflect.Map: - m, err := scanRowToMap(rows, cols) - if err != nil { - return err - } - rv.Elem().Set(reflect.ValueOf(m)) - return nil - case reflect.Struct: - return scanCurrentRowIntoStruct(rows, cols, rv.Elem()) - default: - return fmt.Errorf("unsupported dest kind: %s", rv.Elem().Kind()) - } -} - -func scanRowToMap(rows *sql.Rows, cols []string) (map[string]any, error) { - raw := make([]any, 
len(cols)) - ptrs := make([]any, len(cols)) - for i := range raw { - ptrs[i] = &raw[i] - } - if err := rows.Scan(ptrs...); err != nil { - return nil, err - } - out := make(map[string]any, len(cols)) - for i, c := range cols { - out[c] = normalizeSQLValue(raw[i]) - } - return out, nil -} - -func scanCurrentRowIntoStruct(rows *sql.Rows, cols []string, destStruct reflect.Value) error { - raw := make([]any, len(cols)) - ptrs := make([]any, len(cols)) - for i := range raw { - ptrs[i] = &raw[i] - } - if err := rows.Scan(ptrs...); err != nil { - return err - } - fieldIndex := buildFieldIndex(destStruct.Type()) - for i, c := range cols { - if idx, ok := fieldIndex[strings.ToLower(c)]; ok { - field := destStruct.Field(idx) - if field.CanSet() { - if err := setReflectValue(field, raw[i]); err != nil { - return fmt.Errorf("column %s: %w", c, err) - } - } - } - } - return nil -} - -func normalizeSQLValue(v any) any { - switch t := v.(type) { - case []byte: - return string(t) - default: - return v - } -} - -func buildFieldIndex(t reflect.Type) map[string]int { - m := make(map[string]int) - for i := 0; i < t.NumField(); i++ { - f := t.Field(i) - if f.IsExported() == false { - continue - } - tag := f.Tag.Get("db") - col := "" - if tag != "" { - col = strings.Split(tag, ",")[0] - } - if col == "" { - col = f.Name - } - m[strings.ToLower(col)] = i - } - return m -} - -func setReflectValue(field reflect.Value, raw any) error { - if raw == nil { - // leave zero value - return nil - } - switch field.Kind() { - case reflect.String: - switch v := raw.(type) { - case string: - field.SetString(v) - case []byte: - field.SetString(string(v)) - default: - field.SetString(fmt.Sprint(v)) - } - case reflect.Bool: - switch v := raw.(type) { - case bool: - field.SetBool(v) - case int64: - field.SetBool(v != 0) - case []byte: - s := string(v) - field.SetBool(s == "1" || strings.EqualFold(s, "true")) - default: - field.SetBool(false) - } - case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64: - switch v := raw.(type) { - case int64: - field.SetInt(v) - case []byte: - var n int64 - fmt.Sscan(string(v), &n) - field.SetInt(n) - default: - return fmt.Errorf("cannot convert %T to int", raw) - } - case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64: - switch v := raw.(type) { - case int64: - if v < 0 { - v = 0 - } - field.SetUint(uint64(v)) - case []byte: - var n uint64 - fmt.Sscan(string(v), &n) - field.SetUint(n) - default: - return fmt.Errorf("cannot convert %T to uint", raw) - } - case reflect.Float32, reflect.Float64: - switch v := raw.(type) { - case float64: - field.SetFloat(v) - case []byte: - var fv float64 - fmt.Sscan(string(v), &fv) - field.SetFloat(fv) - default: - return fmt.Errorf("cannot convert %T to float", raw) - } - case reflect.Struct: - // Support time.Time; extend as needed. 
- if field.Type() == reflect.TypeOf(time.Time{}) { - switch v := raw.(type) { - case time.Time: - field.Set(reflect.ValueOf(v)) - case []byte: - // Try RFC3339 - if tt, err := time.Parse(time.RFC3339, string(v)); err == nil { - field.Set(reflect.ValueOf(tt)) - } - } - return nil - } - fallthrough - default: - // Not supported yet - return fmt.Errorf("unsupported dest field kind: %s", field.Kind()) - } - return nil -} - -// ----------------------- -// Save/Remove (basic PK) -// ----------------------- - -type fieldMeta struct { - index int - column string - isPK bool - auto bool -} - -func collectMeta(t reflect.Type) (fields []fieldMeta, pk fieldMeta, hasPK bool) { - for i := 0; i < t.NumField(); i++ { - f := t.Field(i) - if !f.IsExported() { - continue - } - tag := f.Tag.Get("db") - if tag == "-" { - continue - } - opts := strings.Split(tag, ",") - col := opts[0] - if col == "" { - col = f.Name - } - meta := fieldMeta{index: i, column: col} - for _, o := range opts[1:] { - switch strings.ToLower(strings.TrimSpace(o)) { - case "pk": - meta.isPK = true - case "auto", "autoincrement": - meta.auto = true - } - } - // If not tagged as pk, fallback to field name "ID" - if !meta.isPK && f.Name == "ID" { - meta.isPK = true - if col == "" { - meta.column = "id" - } - } - fields = append(fields, meta) - if meta.isPK { - pk = meta - hasPK = true - } - } - return -} - -func getTableNameFromEntity(v reflect.Value) (string, bool) { - // If entity implements TableNamer - if v.CanInterface() { - if tn, ok := v.Interface().(TableNamer); ok { - return tn.TableName(), true - } - } - // Fallback: very naive pluralization (append 's') - typ := v.Type() - if typ.Kind() == reflect.Pointer { - typ = typ.Elem() - } - if typ.Kind() == reflect.Struct { - return strings.ToLower(typ.Name()) + "s", true - } - return "", false -} - -func saveEntity(ctx context.Context, exec executor, entity any) error { - rv := reflect.ValueOf(entity) - if rv.Kind() != reflect.Pointer || rv.IsNil() { - return errors.New("entity must be a non-nil pointer to struct") - } - ev := rv.Elem() - if ev.Kind() != reflect.Struct { - return errors.New("entity must point to a struct") - } - - fields, pkMeta, hasPK := collectMeta(ev.Type()) - if !hasPK { - return errors.New("no primary key field found (tag db:\"...,pk\" or field named ID)") - } - table, ok := getTableNameFromEntity(ev) - if !ok || table == "" { - return errors.New("unable to resolve table name; implement TableNamer or set up a repository with explicit table") - } - - // Build lists - cols := make([]string, 0, len(fields)) - vals := make([]any, 0, len(fields)) - setParts := make([]string, 0, len(fields)) - - var pkVal any - var pkIsZero bool - - for _, fm := range fields { - f := ev.Field(fm.index) - if fm.isPK { - pkVal = f.Interface() - pkIsZero = isZeroValue(f) - continue - } - cols = append(cols, fm.column) - vals = append(vals, f.Interface()) - setParts = append(setParts, fmt.Sprintf("%s = ?", fm.column)) - } - - if pkIsZero { - // INSERT - placeholders := strings.Repeat("?,", len(cols)) - if len(placeholders) > 0 { - placeholders = placeholders[:len(placeholders)-1] - } - sqlStr := fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)", table, strings.Join(cols, ", "), placeholders) - res, err := exec.ExecContext(ctx, sqlStr, vals...) - if err != nil { - return err - } - // Set auto ID if needed - if pkMeta.auto { - if id, err := res.LastInsertId(); err == nil { - ev.Field(pkMeta.index).SetInt(id) - } - } - return nil - } - - // UPDATE ... WHERE pk = ? 
- sqlStr := fmt.Sprintf("UPDATE %s SET %s WHERE %s = ?", table, strings.Join(setParts, ", "), pkMeta.column) - valsWithPK := append(vals, pkVal) - _, err := exec.ExecContext(ctx, sqlStr, valsWithPK...) - return err -} - -func removeEntity(ctx context.Context, exec executor, entity any) error { - rv := reflect.ValueOf(entity) - if rv.Kind() != reflect.Pointer || rv.IsNil() { - return errors.New("entity must be a non-nil pointer to struct") - } - ev := rv.Elem() - if ev.Kind() != reflect.Struct { - return errors.New("entity must point to a struct") - } - _, pkMeta, hasPK := collectMeta(ev.Type()) - if !hasPK { - return errors.New("no primary key field found") - } - table, ok := getTableNameFromEntity(ev) - if !ok || table == "" { - return errors.New("unable to resolve table name") - } - pkVal := ev.Field(pkMeta.index).Interface() - sqlStr := fmt.Sprintf("DELETE FROM %s WHERE %s = ?", table, pkMeta.column) - _, err := exec.ExecContext(ctx, sqlStr, pkVal) - return err -} - -func isZeroValue(v reflect.Value) bool { - switch v.Kind() { - case reflect.String: - return v.Len() == 0 - case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64: - return v.Int() == 0 - case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64: - return v.Uint() == 0 - case reflect.Bool: - return v.Bool() == false - case reflect.Pointer, reflect.Interface: - return v.IsNil() - case reflect.Slice, reflect.Map: - return v.Len() == 0 - case reflect.Struct: - // Special-case time.Time - if v.Type() == reflect.TypeOf(time.Time{}) { - t := v.Interface().(time.Time) - return t.IsZero() - } - zero := reflect.Zero(v.Type()) - return reflect.DeepEqual(v.Interface(), zero.Interface()) - default: - return false - } -} diff --git a/pkg/rqlite/cluster_handlers.go b/pkg/rqlite/cluster_handlers.go new file mode 100644 index 0000000..cfb4b2a --- /dev/null +++ b/pkg/rqlite/cluster_handlers.go @@ -0,0 +1,902 @@ +package rqlite + +import ( + "context" + "fmt" + "time" + + "go.uber.org/zap" +) + +// handleCreateRequest processes a database creation request +func (cm *ClusterManager) handleCreateRequest(msg *MetadataMessage) error { + var req DatabaseCreateRequest + if err := msg.UnmarshalPayload(&req); err != nil { + return err + } + + cm.logger.Info("Received database create request", + zap.String("database", req.DatabaseName), + zap.String("requester", req.RequesterNodeID), + zap.Int("replication_factor", req.ReplicationFactor)) + + // Check if we can host this database + cm.mu.RLock() + currentCount := len(cm.activeClusters) + cm.mu.RUnlock() + + if currentCount >= cm.config.MaxDatabases { + cm.logger.Debug("Cannot host database: at capacity", + zap.String("database", req.DatabaseName), + zap.Int("current", currentCount), + zap.Int("max", cm.config.MaxDatabases)) + return nil + } + + // Allocate ports + ports, err := cm.portManager.AllocatePortPair(req.DatabaseName) + if err != nil { + cm.logger.Warn("Cannot allocate ports for database", + zap.String("database", req.DatabaseName), + zap.Error(err)) + return nil + } + + // Send response offering to host + response := DatabaseCreateResponse{ + DatabaseName: req.DatabaseName, + NodeID: cm.nodeID, + AvailablePorts: ports, + } + + msgData, err := MarshalMetadataMessage(MsgDatabaseCreateResponse, cm.nodeID, response) + if err != nil { + cm.portManager.ReleasePortPair(ports) + return fmt.Errorf("failed to marshal create response: %w", err) + } + + topic := "/debros/metadata/v1" + if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil { + 
cm.portManager.ReleasePortPair(ports) + return fmt.Errorf("failed to publish create response: %w", err) + } + + cm.logger.Info("Sent database create response", + zap.String("database", req.DatabaseName), + zap.Int("http_port", ports.HTTPPort), + zap.Int("raft_port", ports.RaftPort)) + + return nil +} + +// handleCreateResponse processes a database creation response +func (cm *ClusterManager) handleCreateResponse(msg *MetadataMessage) error { + var response DatabaseCreateResponse + if err := msg.UnmarshalPayload(&response); err != nil { + return err + } + + cm.logger.Debug("Received database create response", + zap.String("database", response.DatabaseName), + zap.String("node", response.NodeID)) + + // Forward to coordinator registry + cm.coordinatorRegistry.HandleCreateResponse(response) + + return nil +} + +// handleCreateConfirm processes a database creation confirmation +func (cm *ClusterManager) handleCreateConfirm(msg *MetadataMessage) error { + var confirm DatabaseCreateConfirm + if err := msg.UnmarshalPayload(&confirm); err != nil { + return err + } + + cm.logger.Info("Received database create confirm", + zap.String("database", confirm.DatabaseName), + zap.String("coordinator", confirm.CoordinatorNodeID), + zap.Int("nodes", len(confirm.SelectedNodes))) + + // Check if this node was selected + var myAssignment *NodeAssignment + for i, node := range confirm.SelectedNodes { + if node.NodeID == cm.nodeID { + myAssignment = &confirm.SelectedNodes[i] + break + } + } + + if myAssignment == nil { + cm.logger.Debug("Not selected for this database", + zap.String("database", confirm.DatabaseName)) + return nil + } + + cm.logger.Info("Selected to host database", + zap.String("database", confirm.DatabaseName), + zap.String("role", myAssignment.Role)) + + // Create database metadata + portMappings := make(map[string]PortPair) + nodeIDs := make([]string, len(confirm.SelectedNodes)) + for i, node := range confirm.SelectedNodes { + nodeIDs[i] = node.NodeID + portMappings[node.NodeID] = PortPair{ + HTTPPort: node.HTTPPort, + RaftPort: node.RaftPort, + } + } + + metadata := &DatabaseMetadata{ + DatabaseName: confirm.DatabaseName, + NodeIDs: nodeIDs, + PortMappings: portMappings, + Status: StatusInitializing, + CreatedAt: time.Now(), + LastAccessed: time.Now(), + LeaderNodeID: confirm.SelectedNodes[0].NodeID, // First node is leader + Version: 1, + VectorClock: NewVectorClock(), + } + + // Update vector clock + UpdateDatabaseMetadata(metadata, cm.nodeID) + + // Store metadata + cm.metadataStore.SetDatabase(metadata) + + // Start the RQLite instance + go cm.startDatabaseInstance(metadata, myAssignment.Role == "leader") + + return nil +} + +// startDatabaseInstance starts a database instance on this node +func (cm *ClusterManager) startDatabaseInstance(metadata *DatabaseMetadata, isLeader bool) { + ports := metadata.PortMappings[cm.nodeID] + + // Create advertised addresses + advHTTPAddr := fmt.Sprintf("%s:%d", cm.getAdvertiseAddress(), ports.HTTPPort) + advRaftAddr := fmt.Sprintf("%s:%d", cm.getAdvertiseAddress(), ports.RaftPort) + + // Create instance + instance := NewRQLiteInstance( + metadata.DatabaseName, + ports, + cm.dataDir, + advHTTPAddr, + advRaftAddr, + cm.logger, + ) + + // Determine join address (if follower) + var joinAddr string + if !isLeader && len(metadata.NodeIDs) > 0 { + // Join to the leader + leaderNodeID := metadata.LeaderNodeID + if leaderPorts, exists := metadata.PortMappings[leaderNodeID]; exists { + joinAddr = fmt.Sprintf("%s:%d", cm.getAdvertiseAddress(), leaderPorts.RaftPort) 
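+			// Assumption: the leader shares this node's advertise host, which
+			// only holds when all nodes run on one host (e.g. local dev). A
+			// multi-host deployment would need the leader's address carried in
+			// the metadata, e.g. a hypothetical NodeAddresses map alongside
+			// PortMappings, instead of reusing getAdvertiseAddress() here.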
+		}
+	}
+
+	// Start the instance
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	if err := instance.Start(ctx, isLeader, joinAddr); err != nil {
+		cm.logger.Error("Failed to start database instance",
+			zap.String("database", metadata.DatabaseName),
+			zap.Error(err))
+
+		// There is no dedicated failure status yet, so re-broadcast
+		// "initializing" so peers can see the instance never went active
+		cm.broadcastStatusUpdate(metadata.DatabaseName, StatusInitializing)
+		return
+	}
+
+	// Store active instance
+	cm.mu.Lock()
+	cm.activeClusters[metadata.DatabaseName] = instance
+	cm.mu.Unlock()
+
+	// Broadcast active status
+	cm.broadcastStatusUpdate(metadata.DatabaseName, StatusActive)
+
+	cm.logger.Info("Database instance started and active",
+		zap.String("database", metadata.DatabaseName))
+}
+
+// handleStatusUpdate processes database status updates
+func (cm *ClusterManager) handleStatusUpdate(msg *MetadataMessage) error {
+	var update DatabaseStatusUpdate
+	if err := msg.UnmarshalPayload(&update); err != nil {
+		return err
+	}
+
+	cm.logger.Debug("Received status update",
+		zap.String("database", update.DatabaseName),
+		zap.String("node", update.NodeID),
+		zap.String("status", string(update.Status)))
+
+	// Update metadata (last-write-wins here; any divergence is left for the
+	// periodic checksum gossip to reconcile)
+	if metadata := cm.metadataStore.GetDatabase(update.DatabaseName); metadata != nil {
+		metadata.Status = update.Status
+		metadata.LastAccessed = time.Now()
+		cm.metadataStore.SetDatabase(metadata)
+	}
+
+	return nil
+}
+
+// handleCapacityAnnouncement processes node capacity announcements
+func (cm *ClusterManager) handleCapacityAnnouncement(msg *MetadataMessage) error {
+	var announcement NodeCapacityAnnouncement
+	if err := msg.UnmarshalPayload(&announcement); err != nil {
+		return err
+	}
+
+	capacity := &NodeCapacity{
+		NodeID:           announcement.NodeID,
+		MaxDatabases:     announcement.MaxDatabases,
+		CurrentDatabases: announcement.CurrentDatabases,
+		PortRangeHTTP:    announcement.PortRangeHTTP,
+		PortRangeRaft:    announcement.PortRangeRaft,
+		LastHealthCheck:  time.Now(),
+		IsHealthy:        true,
+	}
+
+	cm.metadataStore.SetNode(capacity)
+
+	return nil
+}
+
+// handleHealthPing processes health ping messages
+func (cm *ClusterManager) handleHealthPing(msg *MetadataMessage) error {
+	var ping NodeHealthPing
+	if err := msg.UnmarshalPayload(&ping); err != nil {
+		return err
+	}
+
+	// Respond with pong
+	pong := NodeHealthPong{
+		NodeID:   cm.nodeID,
+		Healthy:  true,
+		PingFrom: ping.NodeID,
+	}
+
+	msgData, err := MarshalMetadataMessage(MsgNodeHealthPong, cm.nodeID, pong)
+	if err != nil {
+		return err
+	}
+
+	topic := "/debros/metadata/v1"
+	return cm.pubsubAdapter.Publish(cm.ctx, topic, msgData)
+}
+
+// handleMetadataSync processes metadata synchronization messages
+func (cm *ClusterManager) handleMetadataSync(msg *MetadataMessage) error {
+	var sync MetadataSync
+	if err := msg.UnmarshalPayload(&sync); err != nil {
+		return err
+	}
+
+	if sync.Metadata == nil {
+		return nil
+	}
+
+	// Check if we need to update local metadata
+	existing := cm.metadataStore.GetDatabase(sync.Metadata.DatabaseName)
+	if existing == nil {
+		// New database we didn't know about
+		cm.metadataStore.SetDatabase(sync.Metadata)
+		cm.logger.Info("Learned about new database via sync",
+			zap.String("database", sync.Metadata.DatabaseName))
+		return nil
+	}
+
+	// Resolve conflict if versions differ
+	winner := ResolveConflict(existing, sync.Metadata)
+	if winner != existing {
+		cm.metadataStore.SetDatabase(winner)
+		cm.logger.Info("Updated database metadata via sync",
+			zap.String("database", sync.Metadata.DatabaseName))
+	}
+
+	return nil
+}
+
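+// requestFullMetadata is a sketch (not wired up yet) of the helper the two
+// TODOs in handleChecksumResponse below call for. Lacking a dedicated
+// "request" message type, one workable shape is to re-broadcast our own copy
+// of the database's metadata as a MetadataSync: every peer, including the
+// diverged one, then runs the same ResolveConflict merge in
+// handleMetadataSync. The name and approach are illustrative, not final; a
+// real implementation would more likely ask the remote node for its copy.
+func (cm *ClusterManager) requestFullMetadata(dbName string) error {
+	db := cm.metadataStore.GetDatabase(dbName)
+	if db == nil {
+		// Nothing to contribute; rely on the periodic gossip from peers.
+		return nil
+	}
+
+	sync := MetadataSync{Metadata: db}
+	msgData, err := MarshalMetadataMessage(MsgMetadataSync, cm.nodeID, sync)
+	if err != nil {
+		return fmt.Errorf("failed to marshal metadata sync: %w", err)
+	}
+
+	return cm.pubsubAdapter.Publish(cm.ctx, "/debros/metadata/v1", msgData)
+}
+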
+// handleChecksumRequest processes checksum requests
+func (cm *ClusterManager) handleChecksumRequest(msg *MetadataMessage) error {
+	var req MetadataChecksumRequest
+	if err := msg.UnmarshalPayload(&req); err != nil {
+		return err
+	}
+
+	// Compute checksums for all databases
+	checksums := ComputeFullStateChecksum(cm.metadataStore)
+
+	// Send response
+	response := MetadataChecksumResponse{
+		RequestID: req.RequestID,
+		Checksums: checksums,
+	}
+
+	msgData, err := MarshalMetadataMessage(MsgMetadataChecksumRes, cm.nodeID, response)
+	if err != nil {
+		return err
+	}
+
+	topic := "/debros/metadata/v1"
+	return cm.pubsubAdapter.Publish(cm.ctx, topic, msgData)
+}
+
+// handleChecksumResponse processes checksum responses
+func (cm *ClusterManager) handleChecksumResponse(msg *MetadataMessage) error {
+	var response MetadataChecksumResponse
+	if err := msg.UnmarshalPayload(&response); err != nil {
+		return err
+	}
+
+	// Compare with local checksums
+	localChecksums := ComputeFullStateChecksum(cm.metadataStore)
+	localMap := make(map[string]MetadataChecksum)
+	for _, cs := range localChecksums {
+		localMap[cs.DatabaseName] = cs
+	}
+
+	// Check for differences
+	for _, remoteCS := range response.Checksums {
+		localCS, exists := localMap[remoteCS.DatabaseName]
+		if !exists {
+			// Database we don't know about - request full metadata
+			cm.logger.Info("Discovered database via checksum",
+				zap.String("database", remoteCS.DatabaseName))
+			// TODO: request full metadata for this database (see the
+			// requestFullMetadata sketch above)
+			continue
+		}
+
+		if localCS.Hash != remoteCS.Hash {
+			cm.logger.Info("Database metadata diverged",
+				zap.String("database", remoteCS.DatabaseName))
+			// TODO: request full metadata for this database (see the
+			// requestFullMetadata sketch above)
+		}
+	}
+
+	return nil
+}
+
+// broadcastStatusUpdate broadcasts a status update for a database
+func (cm *ClusterManager) broadcastStatusUpdate(dbName string, status DatabaseStatus) {
+	cm.mu.RLock()
+	instance := cm.activeClusters[dbName]
+	cm.mu.RUnlock()
+
+	update := DatabaseStatusUpdate{
+		DatabaseName: dbName,
+		NodeID:       cm.nodeID,
+		Status:       status,
+	}
+
+	if instance != nil {
+		update.HTTPPort = instance.HTTPPort
+		update.RaftPort = instance.RaftPort
+	}
+
+	msgData, err := MarshalMetadataMessage(MsgDatabaseStatusUpdate, cm.nodeID, update)
+	if err != nil {
+		cm.logger.Warn("Failed to marshal status update", zap.Error(err))
+		return
+	}
+
+	topic := "/debros/metadata/v1"
+	if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
+		cm.logger.Warn("Failed to publish status update", zap.Error(err))
+	}
+}
+
+// getAdvertiseAddress returns the advertise address for this node
+func (cm *ClusterManager) getAdvertiseAddress() string {
+	if cm.discoveryConfig.HttpAdvAddress != "" {
+		// Extract just the host part (remove port if present).
+		// NOTE: scanning for the last ':' is naive for unbracketed IPv6
+		// literals; net.SplitHostPort would be more robust.
+		addr := cm.discoveryConfig.HttpAdvAddress
+		for i := len(addr) - 1; i >= 0; i-- {
+			if addr[i] == ':' {
+				return addr[:i]
+			}
+		}
+		return addr
+	}
+	return "127.0.0.1"
+}
+
+// handleIdleNotification processes idle notifications from other nodes
+func (cm *ClusterManager) handleIdleNotification(msg *MetadataMessage) error {
+	var notification DatabaseIdleNotification
+	if err := msg.UnmarshalPayload(&notification); err != nil {
+		return err
+	}
+
+	cm.logger.Debug("Received idle notification",
+		zap.String("database", notification.DatabaseName),
+		zap.String("from_node", notification.NodeID))
+
+	// Get database metadata
+	dbMeta := cm.metadataStore.GetDatabase(notification.DatabaseName)
+	if dbMeta == nil {
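+		// Likely just gossip lag: metadata for this database may not have
+		// reached us yet, so an unknown name is ignored rather than treated
+		// as an error.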
cm.logger.Debug("Idle notification for unknown database", + zap.String("database", notification.DatabaseName)) + return nil + } + + // Track idle count (simple approach: if we see idle from all nodes, coordinate shutdown) + // In production, this would use a more sophisticated quorum mechanism + idleCount := 0 + for _, nodeID := range dbMeta.NodeIDs { + if nodeID == notification.NodeID || nodeID == cm.nodeID { + idleCount++ + } + } + + // If all nodes are idle, coordinate shutdown + if idleCount >= len(dbMeta.NodeIDs) { + cm.logger.Info("All nodes idle for database, coordinating shutdown", + zap.String("database", notification.DatabaseName)) + + // Elect coordinator + coordinator := SelectCoordinator(dbMeta.NodeIDs) + if coordinator == cm.nodeID { + // This node is coordinator, initiate shutdown + shutdown := DatabaseShutdownCoordinated{ + DatabaseName: notification.DatabaseName, + ShutdownTime: time.Now().Add(5 * time.Second), // Grace period + } + + msgData, err := MarshalMetadataMessage(MsgDatabaseShutdownCoordinated, cm.nodeID, shutdown) + if err != nil { + return fmt.Errorf("failed to marshal shutdown message: %w", err) + } + + topic := "/debros/metadata/v1" + if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil { + return fmt.Errorf("failed to publish shutdown message: %w", err) + } + + cm.logger.Info("Coordinated shutdown message sent", + zap.String("database", notification.DatabaseName)) + } + } + + return nil +} + +// handleShutdownCoordinated processes coordinated shutdown messages +func (cm *ClusterManager) handleShutdownCoordinated(msg *MetadataMessage) error { + var shutdown DatabaseShutdownCoordinated + if err := msg.UnmarshalPayload(&shutdown); err != nil { + return err + } + + cm.logger.Info("Received coordinated shutdown", + zap.String("database", shutdown.DatabaseName), + zap.Time("shutdown_time", shutdown.ShutdownTime)) + + // Get database metadata + dbMeta := cm.metadataStore.GetDatabase(shutdown.DatabaseName) + if dbMeta == nil { + cm.logger.Debug("Shutdown for unknown database", + zap.String("database", shutdown.DatabaseName)) + return nil + } + + // Check if this node is a member + isMember := false + for _, nodeID := range dbMeta.NodeIDs { + if nodeID == cm.nodeID { + isMember = true + break + } + } + + if !isMember { + return nil + } + + // Wait until shutdown time + waitDuration := time.Until(shutdown.ShutdownTime) + if waitDuration > 0 { + cm.logger.Debug("Waiting for shutdown time", + zap.String("database", shutdown.DatabaseName), + zap.Duration("wait", waitDuration)) + time.Sleep(waitDuration) + } + + // Stop the instance + cm.mu.Lock() + instance, exists := cm.activeClusters[shutdown.DatabaseName] + if exists { + cm.logger.Info("Stopping database instance for hibernation", + zap.String("database", shutdown.DatabaseName)) + + if err := instance.Stop(); err != nil { + cm.logger.Error("Failed to stop instance", zap.Error(err)) + cm.mu.Unlock() + return err + } + + // Free ports + ports := PortPair{HTTPPort: instance.HTTPPort, RaftPort: instance.RaftPort} + cm.portManager.ReleasePortPair(ports) + + // Remove from active clusters + delete(cm.activeClusters, shutdown.DatabaseName) + } + cm.mu.Unlock() + + // Update metadata status to hibernating + dbMeta.Status = StatusHibernating + dbMeta.LastAccessed = time.Now() + cm.metadataStore.SetDatabase(dbMeta) + + // Broadcast status update + cm.broadcastStatusUpdate(shutdown.DatabaseName, StatusHibernating) + + cm.logger.Info("Database hibernated successfully", + zap.String("database", 
shutdown.DatabaseName)) + + return nil +} + +// handleWakeupRequest processes wake-up requests for hibernating databases +func (cm *ClusterManager) handleWakeupRequest(msg *MetadataMessage) error { + var wakeup DatabaseWakeupRequest + if err := msg.UnmarshalPayload(&wakeup); err != nil { + return err + } + + cm.logger.Info("Received wakeup request", + zap.String("database", wakeup.DatabaseName), + zap.String("requester", wakeup.RequesterNodeID)) + + // Get database metadata + dbMeta := cm.metadataStore.GetDatabase(wakeup.DatabaseName) + if dbMeta == nil { + cm.logger.Warn("Wakeup request for unknown database", + zap.String("database", wakeup.DatabaseName)) + return nil + } + + // Check if database is hibernating + if dbMeta.Status != StatusHibernating { + cm.logger.Debug("Database not hibernating, ignoring wakeup", + zap.String("database", wakeup.DatabaseName), + zap.String("status", string(dbMeta.Status))) + return nil + } + + // Check if this node is a member + isMember := false + for _, nodeID := range dbMeta.NodeIDs { + if nodeID == cm.nodeID { + isMember = true + break + } + } + + if !isMember { + return nil + } + + // Update status to waking + dbMeta.Status = StatusWaking + dbMeta.LastAccessed = time.Now() + cm.metadataStore.SetDatabase(dbMeta) + + // Start the instance + go cm.wakeupDatabase(wakeup.DatabaseName, dbMeta) + + return nil +} + +// wakeupDatabase starts a hibernating database +func (cm *ClusterManager) wakeupDatabase(dbName string, dbMeta *DatabaseMetadata) { + cm.logger.Info("Waking up database", zap.String("database", dbName)) + + // Get port mapping for this node + ports, exists := dbMeta.PortMappings[cm.nodeID] + if !exists { + cm.logger.Error("No port mapping found for node", + zap.String("database", dbName), + zap.String("node", cm.nodeID)) + return + } + + // Try to allocate the same ports (or new ones if taken) + allocatedPorts := ports + if cm.portManager.IsPortAllocated(ports.HTTPPort) || cm.portManager.IsPortAllocated(ports.RaftPort) { + cm.logger.Warn("Original ports taken, allocating new ones", + zap.String("database", dbName)) + newPorts, err := cm.portManager.AllocatePortPair(dbName) + if err != nil { + cm.logger.Error("Failed to allocate ports for wakeup", zap.Error(err)) + return + } + allocatedPorts = newPorts + // Update port mapping in metadata + dbMeta.PortMappings[cm.nodeID] = allocatedPorts + cm.metadataStore.SetDatabase(dbMeta) + } else { + // Mark ports as allocated + if err := cm.portManager.AllocateSpecificPorts(dbName, ports); err != nil { + cm.logger.Error("Failed to allocate specific ports", zap.Error(err)) + return + } + } + + // Determine join address (first node in the list) + joinAddr := "" + if len(dbMeta.NodeIDs) > 0 && dbMeta.NodeIDs[0] != cm.nodeID { + firstNodePorts := dbMeta.PortMappings[dbMeta.NodeIDs[0]] + joinAddr = fmt.Sprintf("http://%s:%d", cm.getAdvertiseAddress(), firstNodePorts.RaftPort) + } + + // Create and start instance + instance := NewRQLiteInstance( + dbName, + allocatedPorts, + cm.dataDir, + cm.getAdvertiseAddress(), + cm.getAdvertiseAddress(), + cm.logger, + ) + + // Determine if this is the leader (first node) + isLeader := len(dbMeta.NodeIDs) > 0 && dbMeta.NodeIDs[0] == cm.nodeID + + if err := instance.Start(cm.ctx, isLeader, joinAddr); err != nil { + cm.logger.Error("Failed to start instance during wakeup", zap.Error(err)) + cm.portManager.ReleasePortPair(allocatedPorts) + return + } + + // Add to active clusters + cm.mu.Lock() + cm.activeClusters[dbName] = instance + cm.mu.Unlock() + + // Update metadata 
status to active + dbMeta.Status = StatusActive + dbMeta.LastAccessed = time.Now() + cm.metadataStore.SetDatabase(dbMeta) + + // Broadcast status update + cm.broadcastStatusUpdate(dbName, StatusActive) + + cm.logger.Info("Database woke up successfully", zap.String("database", dbName)) +} + +// handleNodeReplacementNeeded processes requests to replace a failed node +func (cm *ClusterManager) handleNodeReplacementNeeded(msg *MetadataMessage) error { + var replacement NodeReplacementNeeded + if err := msg.UnmarshalPayload(&replacement); err != nil { + return err + } + + cm.logger.Info("Received node replacement needed", + zap.String("database", replacement.DatabaseName), + zap.String("failed_node", replacement.FailedNodeID)) + + // Get database metadata + dbMeta := cm.metadataStore.GetDatabase(replacement.DatabaseName) + if dbMeta == nil { + cm.logger.Warn("Replacement needed for unknown database", + zap.String("database", replacement.DatabaseName)) + return nil + } + + // Check if we're eligible to replace (not at capacity and healthy) + nodeCapacity := cm.metadataStore.GetNode(cm.nodeID) + if nodeCapacity == nil || nodeCapacity.CurrentDatabases >= nodeCapacity.MaxDatabases { + cm.logger.Debug("Not eligible for replacement - at capacity", + zap.String("database", replacement.DatabaseName)) + return nil + } + + // Check if we're not already a member + for _, nodeID := range dbMeta.NodeIDs { + if nodeID == cm.nodeID { + cm.logger.Debug("Already a member of this database", + zap.String("database", replacement.DatabaseName)) + return nil + } + } + + // Allocate ports for potential replacement + ports, err := cm.portManager.AllocatePortPair(replacement.DatabaseName) + if err != nil { + cm.logger.Warn("Cannot allocate ports for replacement", + zap.String("database", replacement.DatabaseName), + zap.Error(err)) + return nil + } + + // Send replacement offer + response := NodeReplacementOffer{ + DatabaseName: replacement.DatabaseName, + NodeID: cm.nodeID, + AvailablePorts: ports, + } + + msgData, err := MarshalMetadataMessage(MsgNodeReplacementOffer, cm.nodeID, response) + if err != nil { + cm.portManager.ReleasePortPair(ports) + return fmt.Errorf("failed to marshal replacement offer: %w", err) + } + + topic := "/debros/metadata/v1" + if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil { + cm.portManager.ReleasePortPair(ports) + return fmt.Errorf("failed to publish replacement offer: %w", err) + } + + cm.logger.Info("Sent replacement offer", + zap.String("database", replacement.DatabaseName)) + + return nil +} + +// handleNodeReplacementOffer processes offers from nodes to replace a failed node +func (cm *ClusterManager) handleNodeReplacementOffer(msg *MetadataMessage) error { + var offer NodeReplacementOffer + if err := msg.UnmarshalPayload(&offer); err != nil { + return err + } + + cm.logger.Debug("Received replacement offer", + zap.String("database", offer.DatabaseName), + zap.String("from_node", offer.NodeID)) + + // This would be handled by the coordinator who initiated the replacement request + // For now, we'll implement a simple first-come-first-served approach + // In production, this would involve collecting offers and selecting the best node + + dbMeta := cm.metadataStore.GetDatabase(offer.DatabaseName) + if dbMeta == nil { + return nil + } + + // Check if we're a surviving member and should coordinate + isMember := false + for _, nodeID := range dbMeta.NodeIDs { + if nodeID == cm.nodeID { + isMember = true + break + } + } + + if !isMember { + return nil + } + + // 
Simple approach: accept the first offer
+	// In production: collect offers, select based on capacity/health
+	cm.logger.Info("Accepting replacement offer",
+		zap.String("database", offer.DatabaseName),
+		zap.String("new_node", offer.NodeID))
+
+	// Provide a join address from a known-healthy member: ourselves. We
+	// verified above that we are a surviving member, whereas picking another
+	// entry from NodeIDs could hand out the failed node's address (which
+	// node failed is not tracked yet). Format matches the other start paths.
+	ownPorts := dbMeta.PortMappings[cm.nodeID]
+	joinAddr := fmt.Sprintf("%s:%d", cm.getAdvertiseAddress(), ownPorts.RaftPort)
+
+	// Broadcast confirmation
+	confirm := NodeReplacementConfirm{
+		DatabaseName:   offer.DatabaseName,
+		NewNodeID:      offer.NodeID,
+		ReplacedNodeID: "", // TODO: requires failure tracking to fill in
+		NewNodePorts:   offer.AvailablePorts,
+		JoinAddress:    joinAddr,
+	}
+
+	msgData, err := MarshalMetadataMessage(MsgNodeReplacementConfirm, cm.nodeID, confirm)
+	if err != nil {
+		return fmt.Errorf("failed to marshal replacement confirm: %w", err)
+	}
+
+	topic := "/debros/metadata/v1"
+	if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
+		return fmt.Errorf("failed to publish replacement confirm: %w", err)
+	}
+
+	return nil
+}
+
+// handleNodeReplacementConfirm processes confirmation of a replacement node
+func (cm *ClusterManager) handleNodeReplacementConfirm(msg *MetadataMessage) error {
+	var confirm NodeReplacementConfirm
+	if err := msg.UnmarshalPayload(&confirm); err != nil {
+		return err
+	}
+
+	cm.logger.Info("Received node replacement confirm",
+		zap.String("database", confirm.DatabaseName),
+		zap.String("new_node", confirm.NewNodeID),
+		zap.String("replaced_node", confirm.ReplacedNodeID))
+
+	// Get database metadata
+	dbMeta := cm.metadataStore.GetDatabase(confirm.DatabaseName)
+	if dbMeta == nil {
+		cm.logger.Warn("Replacement confirm for unknown database",
+			zap.String("database", confirm.DatabaseName))
+		return nil
+	}
+
+	// Update metadata: replace old node with new node.
+	// NOTE: while ReplacedNodeID is empty (see the offer handler above) this
+	// splice is a no-op: the new node gains a port mapping below without
+	// ever appearing in NodeIDs, so member checks will not recognize it
+	// until failure tracking fills ReplacedNodeID in.
+	newNodes := make([]string, 0, len(dbMeta.NodeIDs))
+	for _, nodeID := range dbMeta.NodeIDs {
+		if nodeID == confirm.ReplacedNodeID {
+			newNodes = append(newNodes, confirm.NewNodeID)
+		} else {
+			newNodes = append(newNodes, nodeID)
+		}
+	}
+	dbMeta.NodeIDs = newNodes
+
+	// Update port mappings
+	delete(dbMeta.PortMappings, confirm.ReplacedNodeID)
+	dbMeta.PortMappings[confirm.NewNodeID] = confirm.NewNodePorts
+
+	cm.metadataStore.SetDatabase(dbMeta)
+
+	// If we're the new node, start the instance and join
+	if confirm.NewNodeID == cm.nodeID {
+		cm.logger.Info("Starting as replacement node",
+			zap.String("database", confirm.DatabaseName))
+
+		go cm.startReplacementInstance(confirm.DatabaseName, confirm.NewNodePorts, confirm.JoinAddress)
+	}
+
+	return nil
+}
+
+// startReplacementInstance starts an instance as a replacement for a failed node
+func (cm *ClusterManager) startReplacementInstance(dbName string, ports PortPair, joinAddr string) {
+	cm.logger.Info("Starting replacement instance",
+		zap.String("database", dbName),
+		zap.String("join_address", joinAddr))
+
+	// Create instance. The advertised addresses include the ports, matching
+	// how startDatabaseInstance constructs them.
+	advHTTPAddr := fmt.Sprintf("%s:%d", cm.getAdvertiseAddress(), ports.HTTPPort)
+	advRaftAddr := fmt.Sprintf("%s:%d", cm.getAdvertiseAddress(), ports.RaftPort)
+	instance := NewRQLiteInstance(
+		dbName,
+		ports,
+		cm.dataDir,
+		advHTTPAddr,
+		advRaftAddr,
+		cm.logger,
+	)
+
+	// Start with join address (always joining existing cluster)
+	if err := instance.Start(cm.ctx, false, joinAddr); err != nil {
+		cm.logger.Error("Failed to start replacement instance", zap.Error(err))
+		cm.portManager.ReleasePortPair(ports)
+		return
+	}
+
+	// Add to active clusters
+	cm.mu.Lock()
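+	// Register before broadcasting: broadcastStatusUpdate takes its own read
+	// lock, so it must only run after Unlock (sync.RWMutex is not
+	// reentrant). The write below is what makes the instance visible to
+	// that lookup.
+	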
cm.activeClusters[dbName] = instance + cm.mu.Unlock() + + // Broadcast active status + cm.broadcastStatusUpdate(dbName, StatusActive) + + cm.logger.Info("Replacement instance started successfully", + zap.String("database", dbName)) +} diff --git a/pkg/rqlite/cluster_manager.go b/pkg/rqlite/cluster_manager.go new file mode 100644 index 0000000..2feafec --- /dev/null +++ b/pkg/rqlite/cluster_manager.go @@ -0,0 +1,519 @@ +package rqlite + +import ( + "context" + "fmt" + "os" + "path/filepath" + "sync" + "time" + + "github.com/DeBrosOfficial/network/pkg/config" + "github.com/DeBrosOfficial/network/pkg/pubsub" + "go.uber.org/zap" +) + +// ClusterManager manages multiple RQLite database clusters on a single node +type ClusterManager struct { + nodeID string + config *config.DatabaseConfig + discoveryConfig *config.DiscoveryConfig + dataDir string + logger *zap.Logger + + metadataStore *MetadataStore + activeClusters map[string]*RQLiteInstance // dbName -> instance + portManager *PortManager + pubsubAdapter *pubsub.ClientAdapter + coordinatorRegistry *CoordinatorRegistry + + mu sync.RWMutex + ctx context.Context + cancel context.CancelFunc +} + +// NewClusterManager creates a new cluster manager +func NewClusterManager( + nodeID string, + cfg *config.DatabaseConfig, + discoveryCfg *config.DiscoveryConfig, + dataDir string, + pubsubAdapter *pubsub.ClientAdapter, + logger *zap.Logger, +) *ClusterManager { + ctx, cancel := context.WithCancel(context.Background()) + + // Initialize port manager + portManager := NewPortManager( + PortRange{Start: cfg.PortRangeHTTPStart, End: cfg.PortRangeHTTPEnd}, + PortRange{Start: cfg.PortRangeRaftStart, End: cfg.PortRangeRaftEnd}, + ) + + return &ClusterManager{ + nodeID: nodeID, + config: cfg, + discoveryConfig: discoveryCfg, + dataDir: dataDir, + logger: logger, + metadataStore: NewMetadataStore(), + activeClusters: make(map[string]*RQLiteInstance), + portManager: portManager, + pubsubAdapter: pubsubAdapter, + coordinatorRegistry: NewCoordinatorRegistry(), + ctx: ctx, + cancel: cancel, + } +} + +// Start starts the cluster manager +func (cm *ClusterManager) Start() error { + cm.logger.Info("Starting cluster manager", + zap.String("node_id", cm.nodeID), + zap.Int("max_databases", cm.config.MaxDatabases)) + + // Subscribe to metadata topic + metadataTopic := "/debros/metadata/v1" + if err := cm.pubsubAdapter.Subscribe(cm.ctx, metadataTopic, cm.handleMetadataMessage); err != nil { + return fmt.Errorf("failed to subscribe to metadata topic: %w", err) + } + + // Announce node capacity + go cm.announceCapacityPeriodically() + + // Start health monitoring + go cm.monitorHealth() + + // Start idle detection for hibernation + if cm.config.HibernationTimeout > 0 { + go cm.monitorIdleDatabases() + } + + // Perform startup reconciliation + go cm.reconcileOrphanedData() + + cm.logger.Info("Cluster manager started successfully") + return nil +} + +// Stop stops the cluster manager +func (cm *ClusterManager) Stop() error { + cm.logger.Info("Stopping cluster manager") + + cm.cancel() + + // Stop all active clusters + cm.mu.Lock() + defer cm.mu.Unlock() + + for dbName, instance := range cm.activeClusters { + cm.logger.Info("Stopping database instance", + zap.String("database", dbName)) + if err := instance.Stop(); err != nil { + cm.logger.Warn("Error stopping database instance", + zap.String("database", dbName), + zap.Error(err)) + } + } + + cm.logger.Info("Cluster manager stopped") + return nil +} + +// handleMetadataMessage processes incoming metadata messages +func (cm 
*ClusterManager) handleMetadataMessage(topic string, data []byte) error { + msg, err := UnmarshalMetadataMessage(data) + if err != nil { + // Silently ignore non-metadata messages (other pubsub traffic) + cm.logger.Debug("Ignoring non-metadata message on metadata topic", zap.Error(err)) + return nil + } + + // Skip messages from self + if msg.NodeID == cm.nodeID { + return nil + } + + cm.logger.Debug("Received metadata message", + zap.String("type", string(msg.Type)), + zap.String("from", msg.NodeID)) + + switch msg.Type { + case MsgDatabaseCreateRequest: + return cm.handleCreateRequest(msg) + case MsgDatabaseCreateResponse: + return cm.handleCreateResponse(msg) + case MsgDatabaseCreateConfirm: + return cm.handleCreateConfirm(msg) + case MsgDatabaseStatusUpdate: + return cm.handleStatusUpdate(msg) + case MsgNodeCapacityAnnouncement: + return cm.handleCapacityAnnouncement(msg) + case MsgNodeHealthPing: + return cm.handleHealthPing(msg) + case MsgDatabaseIdleNotification: + return cm.handleIdleNotification(msg) + case MsgDatabaseShutdownCoordinated: + return cm.handleShutdownCoordinated(msg) + case MsgDatabaseWakeupRequest: + return cm.handleWakeupRequest(msg) + case MsgNodeReplacementNeeded: + return cm.handleNodeReplacementNeeded(msg) + case MsgNodeReplacementOffer: + return cm.handleNodeReplacementOffer(msg) + case MsgNodeReplacementConfirm: + return cm.handleNodeReplacementConfirm(msg) + case MsgMetadataSync: + return cm.handleMetadataSync(msg) + case MsgMetadataChecksumReq: + return cm.handleChecksumRequest(msg) + case MsgMetadataChecksumRes: + return cm.handleChecksumResponse(msg) + default: + cm.logger.Debug("Unhandled message type", zap.String("type", string(msg.Type))) + } + + return nil +} + +// CreateDatabase creates a new database cluster +func (cm *ClusterManager) CreateDatabase(dbName string, replicationFactor int) error { + cm.logger.Info("Initiating database creation", + zap.String("database", dbName), + zap.Int("replication_factor", replicationFactor)) + + // Check if database already exists + if existing := cm.metadataStore.GetDatabase(dbName); existing != nil { + return fmt.Errorf("database %s already exists", dbName) + } + + // Create coordinator for this database creation + coordinator := NewCreateCoordinator(dbName, replicationFactor, cm.nodeID, cm.logger) + cm.coordinatorRegistry.Register(coordinator) + defer cm.coordinatorRegistry.Remove(dbName) + + // Broadcast create request + req := DatabaseCreateRequest{ + DatabaseName: dbName, + RequesterNodeID: cm.nodeID, + ReplicationFactor: replicationFactor, + } + + msgData, err := MarshalMetadataMessage(MsgDatabaseCreateRequest, cm.nodeID, req) + if err != nil { + return fmt.Errorf("failed to marshal create request: %w", err) + } + + topic := "/debros/metadata/v1" + if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil { + return fmt.Errorf("failed to publish create request: %w", err) + } + + cm.logger.Info("Database create request broadcasted, waiting for responses", + zap.String("database", dbName)) + + // Wait for responses (2 seconds timeout) + waitCtx, cancel := context.WithTimeout(cm.ctx, 2*time.Second) + defer cancel() + + if err := coordinator.WaitForResponses(waitCtx, 2*time.Second); err != nil { + cm.logger.Warn("Timeout waiting for responses", zap.String("database", dbName), zap.Error(err)) + } + + // Select nodes + responses := coordinator.GetResponses() + if len(responses) < replicationFactor { + return fmt.Errorf("insufficient nodes responded: got %d, need %d", len(responses), 
replicationFactor)
+	}
+
+	selectedResponses := coordinator.SelectNodes()
+	cm.logger.Info("Selected nodes for database",
+		zap.String("database", dbName),
+		zap.Int("count", len(selectedResponses)))
+
+	// The requester is the only node that collects responses: peers route
+	// CREATE_RESPONSE messages into its CoordinatorRegistry, and because
+	// handleMetadataMessage drops self-published messages the requester is
+	// never among the responders. Electing the lowest responder here would
+	// therefore stall creation (that node has no registered coordinator), so
+	// the requester broadcasts the confirmation itself.
+	cm.logger.Info("Broadcasting creation confirmation as requester",
+		zap.String("database", dbName))
+
+	// Build node assignments
+	assignments := make([]NodeAssignment, len(selectedResponses))
+	for i, resp := range selectedResponses {
+		role := "follower"
+		if i == 0 {
+			role = "leader"
+		}
+		assignments[i] = NodeAssignment{
+			NodeID:   resp.NodeID,
+			HTTPPort: resp.AvailablePorts.HTTPPort,
+			RaftPort: resp.AvailablePorts.RaftPort,
+			Role:     role,
+		}
+	}
+
+	// Broadcast confirmation
+	confirm := DatabaseCreateConfirm{
+		DatabaseName:      dbName,
+		SelectedNodes:     assignments,
+		CoordinatorNodeID: cm.nodeID,
+	}
+
+	msgData, err = MarshalMetadataMessage(MsgDatabaseCreateConfirm, cm.nodeID, confirm)
+	if err != nil {
+		return fmt.Errorf("failed to marshal create confirm: %w", err)
+	}
+
+	if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
+		return fmt.Errorf("failed to publish create confirm: %w", err)
+	}
+
+	cm.logger.Info("Database creation confirmation broadcast",
+		zap.String("database", dbName))
+
+	return nil
+}
+
+// GetDatabase returns the RQLite instance for a database
+func (cm *ClusterManager) GetDatabase(dbName string) *RQLiteInstance {
+	cm.mu.RLock()
+	defer cm.mu.RUnlock()
+	return cm.activeClusters[dbName]
+}
+
+// ListDatabases returns all active database names
+func (cm *ClusterManager) ListDatabases() []string {
+	cm.mu.RLock()
+	defer cm.mu.RUnlock()
+
+	names := make([]string, 0, len(cm.activeClusters))
+	for name := range cm.activeClusters {
+		names = append(names, name)
+	}
+	return names
+}
+
+// GetMetadataStore returns the metadata store
+func (cm *ClusterManager) GetMetadataStore() *MetadataStore {
+	return cm.metadataStore
+}
+
+// announceCapacityPeriodically announces node capacity every 30 seconds
+func (cm *ClusterManager) announceCapacityPeriodically() {
+	ticker := time.NewTicker(30 * time.Second)
+	defer ticker.Stop()
+
+	// Announce immediately
+	cm.announceCapacity()
+
+	for {
+		select {
+		case <-cm.ctx.Done():
+			return
+		case <-ticker.C:
+			cm.announceCapacity()
+		}
+	}
+}
+
+// announceCapacity announces this node's capacity
+func (cm *ClusterManager) announceCapacity() {
+	cm.mu.RLock()
+	currentDatabases := len(cm.activeClusters)
+	cm.mu.RUnlock()
+
+	announcement := NodeCapacityAnnouncement{
+		NodeID:           cm.nodeID,
+		MaxDatabases:     cm.config.MaxDatabases,
+		CurrentDatabases: currentDatabases,
+		PortRangeHTTP:    PortRange{Start: cm.config.PortRangeHTTPStart, End: cm.config.PortRangeHTTPEnd},
+		PortRangeRaft:    PortRange{Start: cm.config.PortRangeRaftStart, End: cm.config.PortRangeRaftEnd},
+	}
+
+	msgData, err := MarshalMetadataMessage(MsgNodeCapacityAnnouncement, cm.nodeID, announcement)
+	if err != nil {
+		cm.logger.Warn("Failed to marshal capacity announcement", zap.Error(err))
+		return
+	}
+
+	topic := "/debros/metadata/v1"
+	if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
+		cm.logger.Warn("Failed to publish capacity announcement", zap.Error(err))
+		return
+	}
+
+	// Update local
metadata store + capacity := &NodeCapacity{ + NodeID: cm.nodeID, + MaxDatabases: cm.config.MaxDatabases, + CurrentDatabases: currentDatabases, + PortRangeHTTP: announcement.PortRangeHTTP, + PortRangeRaft: announcement.PortRangeRaft, + LastHealthCheck: time.Now(), + IsHealthy: true, + } + cm.metadataStore.SetNode(capacity) +} + +// monitorHealth monitors the health of active databases +func (cm *ClusterManager) monitorHealth() { + ticker := time.NewTicker(cm.discoveryConfig.HealthCheckInterval) + defer ticker.Stop() + + for { + select { + case <-cm.ctx.Done(): + return + case <-ticker.C: + cm.checkDatabaseHealth() + } + } +} + +// checkDatabaseHealth checks if all active databases are healthy +func (cm *ClusterManager) checkDatabaseHealth() { + cm.mu.RLock() + defer cm.mu.RUnlock() + + for dbName, instance := range cm.activeClusters { + if !instance.IsRunning() { + cm.logger.Warn("Database instance is not running", + zap.String("database", dbName)) + // TODO: Implement recovery logic + } + } +} + +// monitorIdleDatabases monitors for idle databases to hibernate +func (cm *ClusterManager) monitorIdleDatabases() { + ticker := time.NewTicker(10 * time.Second) + defer ticker.Stop() + + for { + select { + case <-cm.ctx.Done(): + return + case <-ticker.C: + cm.detectIdleDatabases() + } + } +} + +// detectIdleDatabases detects idle databases and broadcasts idle notifications +func (cm *ClusterManager) detectIdleDatabases() { + cm.mu.RLock() + defer cm.mu.RUnlock() + + for dbName, instance := range cm.activeClusters { + if instance.IsIdle(cm.config.HibernationTimeout) && instance.Status == StatusActive { + cm.logger.Debug("Database is idle", + zap.String("database", dbName), + zap.Duration("idle_time", time.Since(instance.LastQuery))) + + // Broadcast idle notification + notification := DatabaseIdleNotification{ + DatabaseName: dbName, + NodeID: cm.nodeID, + LastActivity: instance.LastQuery, + } + + msgData, err := MarshalMetadataMessage(MsgDatabaseIdleNotification, cm.nodeID, notification) + if err != nil { + cm.logger.Warn("Failed to marshal idle notification", zap.Error(err)) + continue + } + + topic := "/debros/metadata/v1" + if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil { + cm.logger.Warn("Failed to publish idle notification", zap.Error(err)) + } + } + } +} + +// reconcileOrphanedData checks for orphaned database directories +func (cm *ClusterManager) reconcileOrphanedData() { + // Wait a bit for metadata to sync + time.Sleep(10 * time.Second) + + cm.logger.Info("Starting orphaned data reconciliation") + + // Read data directory + entries, err := os.ReadDir(cm.dataDir) + if err != nil { + cm.logger.Error("Failed to read data directory for reconciliation", zap.Error(err)) + return + } + + orphanCount := 0 + for _, entry := range entries { + if !entry.IsDir() { + continue + } + + dbName := entry.Name() + + // Skip special directories + if dbName == "rqlite" || dbName == "." || dbName == ".." 
{ + continue + } + + // Check if this database exists in metadata + dbMeta := cm.metadataStore.GetDatabase(dbName) + if dbMeta == nil { + // Orphaned directory - no metadata exists + cm.logger.Warn("Found orphaned database directory", + zap.String("database", dbName)) + orphanCount++ + + // Delete the orphaned directory + dbPath := filepath.Join(cm.dataDir, dbName) + if err := os.RemoveAll(dbPath); err != nil { + cm.logger.Error("Failed to remove orphaned directory", + zap.String("database", dbName), + zap.String("path", dbPath), + zap.Error(err)) + } else { + cm.logger.Info("Removed orphaned database directory", + zap.String("database", dbName)) + } + continue + } + + // Check if this node is a member of the database + isMember := false + for _, nodeID := range dbMeta.NodeIDs { + if nodeID == cm.nodeID { + isMember = true + break + } + } + + if !isMember { + // This node is not a member - orphaned data + cm.logger.Warn("Found database directory for non-member database", + zap.String("database", dbName)) + orphanCount++ + + dbPath := filepath.Join(cm.dataDir, dbName) + if err := os.RemoveAll(dbPath); err != nil { + cm.logger.Error("Failed to remove non-member directory", + zap.String("database", dbName), + zap.String("path", dbPath), + zap.Error(err)) + } else { + cm.logger.Info("Removed non-member database directory", + zap.String("database", dbName)) + } + } + } + + cm.logger.Info("Orphaned data reconciliation complete", + zap.Int("orphans_found", orphanCount)) +} diff --git a/pkg/rqlite/consensus.go b/pkg/rqlite/consensus.go new file mode 100644 index 0000000..0f507a3 --- /dev/null +++ b/pkg/rqlite/consensus.go @@ -0,0 +1,180 @@ +package rqlite + +import ( + "crypto/sha256" + "encoding/hex" + "encoding/json" + "sort" + "time" +) + +// SelectCoordinator deterministically selects a coordinator from a list of node IDs +// Uses lexicographic ordering (lowest ID wins) +func SelectCoordinator(nodeIDs []string) string { + if len(nodeIDs) == 0 { + return "" + } + + sorted := make([]string, len(nodeIDs)) + copy(sorted, nodeIDs) + sort.Strings(sorted) + + return sorted[0] +} + +// ResolveConflict resolves a conflict between two database metadata entries +// Returns the winning metadata entry +func ResolveConflict(local, remote *DatabaseMetadata) *DatabaseMetadata { + if local == nil { + return remote + } + if remote == nil { + return local + } + + // Compare vector clocks + localVC := VectorClock(local.VectorClock) + remoteVC := VectorClock(remote.VectorClock) + + comparison := localVC.Compare(remoteVC) + + if comparison == -1 { + // Local happens before remote, remote wins + return remote + } else if comparison == 1 { + // Remote happens before local, local wins + return local + } + + // Concurrent: use version number as tiebreaker + if remote.Version > local.Version { + return remote + } else if local.Version > remote.Version { + return local + } + + // Same version: use timestamp as tiebreaker + if remote.CreatedAt.After(local.CreatedAt) { + return remote + } else if local.CreatedAt.After(remote.CreatedAt) { + return local + } + + // Same timestamp: use lexicographic comparison of database name + if remote.DatabaseName < local.DatabaseName { + return remote + } + + return local +} + +// MetadataChecksum represents a checksum of database metadata +type MetadataChecksum struct { + DatabaseName string `json:"database_name"` + Version uint64 `json:"version"` + Hash string `json:"hash"` +} + +// ComputeMetadataChecksum computes a checksum for database metadata +func ComputeMetadataChecksum(db 
*DatabaseMetadata) MetadataChecksum { + if db == nil { + return MetadataChecksum{} + } + + // Create a canonical representation for hashing + canonical := struct { + DatabaseName string + NodeIDs []string + PortMappings map[string]PortPair + Status DatabaseStatus + }{ + DatabaseName: db.DatabaseName, + NodeIDs: make([]string, len(db.NodeIDs)), + PortMappings: db.PortMappings, + Status: db.Status, + } + + // Sort node IDs for deterministic hashing + copy(canonical.NodeIDs, db.NodeIDs) + sort.Strings(canonical.NodeIDs) + + // Serialize and hash + data, _ := json.Marshal(canonical) + hash := sha256.Sum256(data) + + return MetadataChecksum{ + DatabaseName: db.DatabaseName, + Version: db.Version, + Hash: hex.EncodeToString(hash[:]), + } +} + +// ComputeFullStateChecksum computes checksums for all databases in the store +func ComputeFullStateChecksum(store *MetadataStore) []MetadataChecksum { + checksums := make([]MetadataChecksum, 0) + + for _, name := range store.ListDatabases() { + if db := store.GetDatabase(name); db != nil { + checksums = append(checksums, ComputeMetadataChecksum(db)) + } + } + + // Sort by database name for deterministic ordering + sort.Slice(checksums, func(i, j int) bool { + return checksums[i].DatabaseName < checksums[j].DatabaseName + }) + + return checksums +} + +// SelectNodesForDatabase selects N nodes from the list of healthy nodes +// Returns up to replicationFactor nodes +func SelectNodesForDatabase(healthyNodes []string, replicationFactor int) []string { + if len(healthyNodes) == 0 { + return []string{} + } + + // Sort for deterministic selection + sorted := make([]string, len(healthyNodes)) + copy(sorted, healthyNodes) + sort.Strings(sorted) + + // Select first N nodes + n := replicationFactor + if n > len(sorted) { + n = len(sorted) + } + + return sorted[:n] +} + +// IsNodeInCluster checks if a node is part of a database cluster +func IsNodeInCluster(nodeID string, db *DatabaseMetadata) bool { + if db == nil { + return false + } + + for _, id := range db.NodeIDs { + if id == nodeID { + return true + } + } + return false +} + +// UpdateDatabaseMetadata updates metadata with vector clock and version increment +func UpdateDatabaseMetadata(db *DatabaseMetadata, nodeID string) { + if db.VectorClock == nil { + db.VectorClock = NewVectorClock() + } + + // Increment vector clock for this node + vc := VectorClock(db.VectorClock) + vc.Increment(nodeID) + + // Increment version + db.Version++ + + // Update last accessed time + db.LastAccessed = time.Now() +} diff --git a/pkg/rqlite/coordinator.go b/pkg/rqlite/coordinator.go new file mode 100644 index 0000000..aa9c3e6 --- /dev/null +++ b/pkg/rqlite/coordinator.go @@ -0,0 +1,145 @@ +package rqlite + +import ( + "context" + "fmt" + "sort" + "sync" + "time" + + "go.uber.org/zap" +) + +// CreateCoordinator coordinates the database creation process +type CreateCoordinator struct { + dbName string + replicationFactor int + requesterID string + responses []DatabaseCreateResponse + mu sync.Mutex + logger *zap.Logger +} + +// NewCreateCoordinator creates a new coordinator for database creation +func NewCreateCoordinator(dbName string, replicationFactor int, requesterID string, logger *zap.Logger) *CreateCoordinator { + return &CreateCoordinator{ + dbName: dbName, + replicationFactor: replicationFactor, + requesterID: requesterID, + responses: make([]DatabaseCreateResponse, 0), + logger: logger, + } +} + +// AddResponse adds a response from a node +func (cc *CreateCoordinator) AddResponse(response DatabaseCreateResponse) { + 
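// Called from the pubsub handler goroutine while WaitForResponses polls
+	// ResponseCount concurrently, hence the mutex around the slice.
+	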
cc.mu.Lock() + defer cc.mu.Unlock() + cc.responses = append(cc.responses, response) +} + +// GetResponses returns all collected responses +func (cc *CreateCoordinator) GetResponses() []DatabaseCreateResponse { + cc.mu.Lock() + defer cc.mu.Unlock() + return append([]DatabaseCreateResponse(nil), cc.responses...) +} + +// ResponseCount returns the number of responses received +func (cc *CreateCoordinator) ResponseCount() int { + cc.mu.Lock() + defer cc.mu.Unlock() + return len(cc.responses) +} + +// SelectNodes selects the best nodes for the database cluster +func (cc *CreateCoordinator) SelectNodes() []DatabaseCreateResponse { + cc.mu.Lock() + defer cc.mu.Unlock() + + if len(cc.responses) < cc.replicationFactor { + cc.logger.Warn("Insufficient responses for database creation", + zap.String("database", cc.dbName), + zap.Int("required", cc.replicationFactor), + zap.Int("received", len(cc.responses))) + // Return what we have if less than required + return cc.responses + } + + // Sort responses by node ID for deterministic selection + sorted := make([]DatabaseCreateResponse, len(cc.responses)) + copy(sorted, cc.responses) + sort.Slice(sorted, func(i, j int) bool { + return sorted[i].NodeID < sorted[j].NodeID + }) + + // Select first N nodes + return sorted[:cc.replicationFactor] +} + +// WaitForResponses waits for responses with a timeout +func (cc *CreateCoordinator) WaitForResponses(ctx context.Context, timeout time.Duration) error { + deadline := time.Now().Add(timeout) + + ticker := time.NewTicker(100 * time.Millisecond) + defer ticker.Stop() + + for { + select { + case <-ctx.Done(): + return ctx.Err() + case <-ticker.C: + if time.Now().After(deadline) { + return fmt.Errorf("timeout waiting for responses") + } + if cc.ResponseCount() >= cc.replicationFactor { + return nil + } + } + } +} + +// CoordinatorRegistry manages active coordinators for database creation +type CoordinatorRegistry struct { + coordinators map[string]*CreateCoordinator // dbName -> coordinator + mu sync.RWMutex +} + +// NewCoordinatorRegistry creates a new coordinator registry +func NewCoordinatorRegistry() *CoordinatorRegistry { + return &CoordinatorRegistry{ + coordinators: make(map[string]*CreateCoordinator), + } +} + +// Register registers a new coordinator +func (cr *CoordinatorRegistry) Register(coordinator *CreateCoordinator) { + cr.mu.Lock() + defer cr.mu.Unlock() + cr.coordinators[coordinator.dbName] = coordinator +} + +// Get retrieves a coordinator by database name +func (cr *CoordinatorRegistry) Get(dbName string) *CreateCoordinator { + cr.mu.RLock() + defer cr.mu.RUnlock() + return cr.coordinators[dbName] +} + +// Remove removes a coordinator +func (cr *CoordinatorRegistry) Remove(dbName string) { + cr.mu.Lock() + defer cr.mu.Unlock() + delete(cr.coordinators, dbName) +} + +// HandleCreateResponse handles a CREATE_RESPONSE message +func (cr *CoordinatorRegistry) HandleCreateResponse(response DatabaseCreateResponse) { + cr.mu.RLock() + coordinator := cr.coordinators[response.DatabaseName] + cr.mu.RUnlock() + + if coordinator != nil { + coordinator.AddResponse(response) + } +} diff --git a/pkg/rqlite/gateway.go b/pkg/rqlite/gateway.go deleted file mode 100644 index 369b2e1..0000000 --- a/pkg/rqlite/gateway.go +++ /dev/null @@ -1,615 +0,0 @@ -package rqlite - -// HTTP gateway for the rqlite ORM client. -// -// This file exposes a minimal, SDK-friendly HTTP interface over the ORM-like -// client defined in client.go. 
It maps high-level operations (Query, Exec, -// FindBy, FindOneBy, QueryBuilder-based SELECTs, Transactions) and a few schema -// helpers into JSON-over-HTTP endpoints that can be called from any language. -// -// Endpoints (under BasePath, default: /v1/db): -// - POST {base}/query -> arbitrary SELECT; returns rows as []map[string]any -// - POST {base}/exec -> write statement (INSERT/UPDATE/DELETE/DDL); returns {rows_affected,last_insert_id} -// - POST {base}/find -> FindBy(table, criteria, opts...) -> returns []map -// - POST {base}/find-one -> FindOneBy(table, criteria, opts...) -> returns map -// - POST {base}/select -> Fluent SELECT builder via JSON (joins, where, order, group, limit, offset); returns []map or one map if one=true -// - POST {base}/transaction -> Execute a sequence of exec/query ops atomically; optionally return results -// -// Schema helpers (convenience; powered via Exec/Query): -// - GET {base}/schema -> list of user tables/views and create SQL -// - POST {base}/create-table -> {schema: "CREATE TABLE ..."} -> status ok -// - POST {base}/drop-table -> {table: "name"} -> status ok (safe-validated identifier) -// -// Notes: -// - All numbers in JSON are decoded as float64 by default; we best-effort coerce -// integral values to int64 for SQL placeholders. -// - The Save/Remove reflection helpers in the ORM require concrete Go structs; -// exposing them generically over HTTP is not portable. Prefer using the Exec -// and Find APIs, or the Select builder for CRUD-like flows. - -import ( - "context" - "database/sql" - "encoding/json" - "errors" - "fmt" - "net/http" - "regexp" - "strings" - "time" -) - -// HTTPGateway exposes the ORM Client as a set of HTTP handlers. -type HTTPGateway struct { - // Client is the ORM-like rqlite client to execute operations against. - Client Client - // BasePath is the prefix for all routes, e.g. "/v1/db". - // If empty, defaults to "/v1/db". A trailing slash is trimmed. - BasePath string - - // Optional: Request timeout. If > 0, handlers will use a context with this timeout. - Timeout time.Duration -} - -// NewHTTPGateway constructs a new HTTPGateway with sensible defaults. -func NewHTTPGateway(c Client, base string) *HTTPGateway { - return &HTTPGateway{ - Client: c, - BasePath: base, - } -} - -// RegisterRoutes registers all handlers onto the provided mux under BasePath. -func (g *HTTPGateway) RegisterRoutes(mux *http.ServeMux) { - base := g.base() - mux.HandleFunc(base+"/query", g.handleQuery) - mux.HandleFunc(base+"/exec", g.handleExec) - mux.HandleFunc(base+"/find", g.handleFind) - mux.HandleFunc(base+"/find-one", g.handleFindOne) - mux.HandleFunc(base+"/select", g.handleSelect) - // Keep "transaction" for compatibility with existing routes. 
- mux.HandleFunc(base+"/transaction", g.handleTransaction) - - // Schema helpers - mux.HandleFunc(base+"/schema", g.handleSchema) - mux.HandleFunc(base+"/create-table", g.handleCreateTable) - mux.HandleFunc(base+"/drop-table", g.handleDropTable) -} - -func (g *HTTPGateway) base() string { - b := strings.TrimSpace(g.BasePath) - if b == "" { - b = "/v1/db" - } - if b != "/" { - b = strings.TrimRight(b, "/") - } - return b -} - -func (g *HTTPGateway) withTimeout(ctx context.Context) (context.Context, context.CancelFunc) { - if g.Timeout > 0 { - return context.WithTimeout(ctx, g.Timeout) - } - return context.WithCancel(ctx) -} - -// -------------------- -// Common HTTP helpers -// -------------------- - -func writeJSON(w http.ResponseWriter, code int, v any) { - w.Header().Set("Content-Type", "application/json") - w.WriteHeader(code) - _ = json.NewEncoder(w).Encode(v) -} - -func writeError(w http.ResponseWriter, code int, msg string) { - writeJSON(w, code, map[string]any{"error": msg}) -} - -func onlyMethod(w http.ResponseWriter, r *http.Request, method string) bool { - if r.Method != method { - writeError(w, http.StatusMethodNotAllowed, "method not allowed") - return false - } - return true -} - -// Normalize JSON-decoded args for SQL placeholders. -// - Convert float64 with integral value to int64 to better match SQLite expectations. -// - Leave strings, bools and nulls as-is. -// - Recursively normalizes nested arrays if present. -func normalizeArgs(args []any) []any { - out := make([]any, len(args)) - for i, a := range args { - switch v := a.(type) { - case float64: - // If v is integral (within epsilon), convert to int64 - if v == float64(int64(v)) { - out[i] = int64(v) - } else { - out[i] = v - } - case []any: - out[i] = normalizeArgs(v) - default: - out[i] = a - } - } - return out -} - -// -------------------- -// Request DTOs -// -------------------- - -type queryRequest struct { - SQL string `json:"sql"` - Args []any `json:"args"` -} - -type execRequest struct { - SQL string `json:"sql"` - Args []any `json:"args"` -} - -type findOptions struct { - Select []string `json:"select"` - OrderBy []string `json:"order_by"` - GroupBy []string `json:"group_by"` - Limit *int `json:"limit"` - Offset *int `json:"offset"` - Joins []joinBody `json:"joins"` -} - -type findRequest struct { - Table string `json:"table"` - Criteria map[string]any `json:"criteria"` - Options findOptions `json:"options"` - // Back-compat: allow options at top-level too - Select []string `json:"select"` - OrderBy []string `json:"order_by"` - GroupBy []string `json:"group_by"` - Limit *int `json:"limit"` - Offset *int `json:"offset"` - Joins []joinBody `json:"joins"` -} - -type findOneRequest = findRequest - -type joinBody struct { - Kind string `json:"kind"` // "INNER" | "LEFT" | "JOIN" - Table string `json:"table"` // table name - On string `json:"on"` // join condition -} - -type whereBody struct { - Conj string `json:"conj"` // "AND" | "OR" (default AND) - Expr string `json:"expr"` // e.g., "a = ? AND b > ?" 
- Args []any `json:"args"` -} - -type selectRequest struct { - Table string `json:"table"` - Alias string `json:"alias"` - Select []string `json:"select"` - Joins []joinBody `json:"joins"` - Where []whereBody `json:"where"` - GroupBy []string `json:"group_by"` - OrderBy []string `json:"order_by"` - Limit *int `json:"limit"` - Offset *int `json:"offset"` - One bool `json:"one"` // if true, returns a single row (object) -} - -type txOp struct { - Kind string `json:"kind"` // "exec" | "query" - SQL string `json:"sql"` - Args []any `json:"args"` -} - -type transactionRequest struct { - Ops []txOp `json:"ops"` - ReturnResults bool `json:"return_results"` // if true, returns per-op results - StopOnError bool `json:"stop_on_error"` // default true in tx - PartialResults bool `json:"partial_results"` // ignored for actual TX (atomic); kept for API symmetry -} - -// -------------------- -// Handlers -// -------------------- - -func (g *HTTPGateway) handleQuery(w http.ResponseWriter, r *http.Request) { - if !onlyMethod(w, r, http.MethodPost) { - return - } - if g.Client == nil { - writeError(w, http.StatusServiceUnavailable, "client not initialized") - return - } - var body queryRequest - if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.SQL) == "" { - writeError(w, http.StatusBadRequest, "invalid body: {sql, args?}") - return - } - args := normalizeArgs(body.Args) - ctx, cancel := g.withTimeout(r.Context()) - defer cancel() - - out := make([]map[string]any, 0, 16) - if err := g.Client.Query(ctx, &out, body.SQL, args...); err != nil { - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - writeJSON(w, http.StatusOK, map[string]any{ - "items": out, - "count": len(out), - }) -} - -func (g *HTTPGateway) handleExec(w http.ResponseWriter, r *http.Request) { - if !onlyMethod(w, r, http.MethodPost) { - return - } - if g.Client == nil { - writeError(w, http.StatusServiceUnavailable, "client not initialized") - return - } - var body execRequest - if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.SQL) == "" { - writeError(w, http.StatusBadRequest, "invalid body: {sql, args?}") - return - } - args := normalizeArgs(body.Args) - ctx, cancel := g.withTimeout(r.Context()) - defer cancel() - - res, err := g.Client.Exec(ctx, body.SQL, args...) 
- if err != nil { - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - liid, _ := res.LastInsertId() - ra, _ := res.RowsAffected() - writeJSON(w, http.StatusOK, map[string]any{ - "rows_affected": ra, - "last_insert_id": liid, - "execution_state": "ok", - }) -} - -func (g *HTTPGateway) handleFind(w http.ResponseWriter, r *http.Request) { - if !onlyMethod(w, r, http.MethodPost) { - return - } - if g.Client == nil { - writeError(w, http.StatusServiceUnavailable, "client not initialized") - return - } - var body findRequest - if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Table) == "" { - writeError(w, http.StatusBadRequest, "invalid body: {table, criteria, options?}") - return - } - opts := makeFindOptions(mergeFindOptions(body)) - ctx, cancel := g.withTimeout(r.Context()) - defer cancel() - - out := make([]map[string]any, 0, 32) - if err := g.Client.FindBy(ctx, &out, body.Table, body.Criteria, opts...); err != nil { - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - writeJSON(w, http.StatusOK, map[string]any{ - "items": out, - "count": len(out), - }) -} - -func (g *HTTPGateway) handleFindOne(w http.ResponseWriter, r *http.Request) { - if !onlyMethod(w, r, http.MethodPost) { - return - } - if g.Client == nil { - writeError(w, http.StatusServiceUnavailable, "client not initialized") - return - } - var body findOneRequest - if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Table) == "" { - writeError(w, http.StatusBadRequest, "invalid body: {table, criteria, options?}") - return - } - opts := makeFindOptions(mergeFindOptions(body)) - ctx, cancel := g.withTimeout(r.Context()) - defer cancel() - - row := make(map[string]any) - if err := g.Client.FindOneBy(ctx, &row, body.Table, body.Criteria, opts...); err != nil { - if errors.Is(err, sql.ErrNoRows) { - writeError(w, http.StatusNotFound, "not found") - return - } - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - writeJSON(w, http.StatusOK, row) -} - -func (g *HTTPGateway) handleSelect(w http.ResponseWriter, r *http.Request) { - if !onlyMethod(w, r, http.MethodPost) { - return - } - if g.Client == nil { - writeError(w, http.StatusServiceUnavailable, "client not initialized") - return - } - var body selectRequest - if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Table) == "" { - writeError(w, http.StatusBadRequest, "invalid body: {table, select?, where?, joins?, order_by?, group_by?, limit?, offset?, one?}") - return - } - ctx, cancel := g.withTimeout(r.Context()) - defer cancel() - - qb := g.Client.CreateQueryBuilder(body.Table) - if alias := strings.TrimSpace(body.Alias); alias != "" { - qb = qb.Alias(alias) - } - if len(body.Select) > 0 { - qb = qb.Select(body.Select...) - } - // joins - for _, j := range body.Joins { - switch strings.ToUpper(strings.TrimSpace(j.Kind)) { - case "INNER": - qb = qb.InnerJoin(j.Table, j.On) - case "LEFT": - qb = qb.LeftJoin(j.Table, j.On) - default: - qb = qb.Join(j.Table, j.On) - } - } - // where - for _, wcl := range body.Where { - switch strings.ToUpper(strings.TrimSpace(wcl.Conj)) { - case "OR": - qb = qb.OrWhere(wcl.Expr, normalizeArgs(wcl.Args)...) - default: - qb = qb.AndWhere(wcl.Expr, normalizeArgs(wcl.Args)...) - } - } - // group/order/limit/offset - if len(body.GroupBy) > 0 { - qb = qb.GroupBy(body.GroupBy...) - } - if len(body.OrderBy) > 0 { - qb = qb.OrderBy(body.OrderBy...) 
- } - if body.Limit != nil { - qb = qb.Limit(*body.Limit) - } - if body.Offset != nil { - qb = qb.Offset(*body.Offset) - } - - if body.One { - row := make(map[string]any) - if err := qb.GetOne(ctx, &row); err != nil { - if errors.Is(err, sql.ErrNoRows) { - writeError(w, http.StatusNotFound, "not found") - return - } - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - writeJSON(w, http.StatusOK, row) - return - } - - rows := make([]map[string]any, 0, 32) - if err := qb.GetMany(ctx, &rows); err != nil { - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - writeJSON(w, http.StatusOK, map[string]any{ - "items": rows, - "count": len(rows), - }) -} - -func (g *HTTPGateway) handleTransaction(w http.ResponseWriter, r *http.Request) { - if !onlyMethod(w, r, http.MethodPost) { - return - } - if g.Client == nil { - writeError(w, http.StatusServiceUnavailable, "client not initialized") - return - } - var body transactionRequest - if err := json.NewDecoder(r.Body).Decode(&body); err != nil || len(body.Ops) == 0 { - writeError(w, http.StatusBadRequest, "invalid body: {ops:[{kind,sql,args?}], return_results?}") - return - } - ctx, cancel := g.withTimeout(r.Context()) - defer cancel() - - results := make([]any, 0, len(body.Ops)) - err := g.Client.Tx(ctx, func(tx Tx) error { - for _, op := range body.Ops { - switch strings.ToLower(strings.TrimSpace(op.Kind)) { - case "exec": - res, err := tx.Exec(ctx, op.SQL, normalizeArgs(op.Args)...) - if err != nil { - return err - } - if body.ReturnResults { - li, _ := res.LastInsertId() - ra, _ := res.RowsAffected() - results = append(results, map[string]any{ - "rows_affected": ra, - "last_insert_id": li, - }) - } - case "query": - var rows []map[string]any - if err := tx.Query(ctx, &rows, op.SQL, normalizeArgs(op.Args)...); err != nil { - return err - } - if body.ReturnResults { - results = append(results, rows) - } - default: - return fmt.Errorf("invalid op kind: %s", op.Kind) - } - } - return nil - }) - if err != nil { - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - if body.ReturnResults { - writeJSON(w, http.StatusOK, map[string]any{ - "status": "ok", - "results": results, - }) - return - } - writeJSON(w, http.StatusOK, map[string]any{"status": "ok"}) -} - -// -------------------- -// Schema helpers -// -------------------- - -func (g *HTTPGateway) handleSchema(w http.ResponseWriter, r *http.Request) { - if !onlyMethod(w, r, http.MethodGet) { - return - } - if g.Client == nil { - writeError(w, http.StatusServiceUnavailable, "client not initialized") - return - } - ctx, cancel := g.withTimeout(r.Context()) - defer cancel() - - sqlText := `SELECT name, type, sql FROM sqlite_master WHERE type IN ('table','view') AND name NOT LIKE 'sqlite_%' ORDER BY name` - var rows []map[string]any - if err := g.Client.Query(ctx, &rows, sqlText); err != nil { - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - writeJSON(w, http.StatusOK, map[string]any{ - "objects": rows, - "count": len(rows), - }) -} - -func (g *HTTPGateway) handleCreateTable(w http.ResponseWriter, r *http.Request) { - if !onlyMethod(w, r, http.MethodPost) { - return - } - if g.Client == nil { - writeError(w, http.StatusServiceUnavailable, "client not initialized") - return - } - var body struct { - Schema string `json:"schema"` - } - if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Schema) == "" { - writeError(w, http.StatusBadRequest, "invalid body: {schema}") - return - } - ctx, 
cancel := g.withTimeout(r.Context()) - defer cancel() - - if _, err := g.Client.Exec(ctx, body.Schema); err != nil { - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - writeJSON(w, http.StatusCreated, map[string]any{"status": "ok"}) -} - -var identRe = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`) - -func (g *HTTPGateway) handleDropTable(w http.ResponseWriter, r *http.Request) { - if !onlyMethod(w, r, http.MethodPost) { - return - } - if g.Client == nil { - writeError(w, http.StatusServiceUnavailable, "client not initialized") - return - } - var body struct { - Table string `json:"table"` - } - if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Table) == "" { - writeError(w, http.StatusBadRequest, "invalid body: {table}") - return - } - tbl := strings.TrimSpace(body.Table) - if !identRe.MatchString(tbl) { - writeError(w, http.StatusBadRequest, "invalid table identifier") - return - } - ctx, cancel := g.withTimeout(r.Context()) - defer cancel() - - stmt := "DROP TABLE IF EXISTS " + tbl - if _, err := g.Client.Exec(ctx, stmt); err != nil { - writeError(w, http.StatusInternalServerError, err.Error()) - return - } - writeJSON(w, http.StatusOK, map[string]any{"status": "ok"}) -} - -// -------------------- -// Helpers -// -------------------- - -func mergeFindOptions(fr findRequest) findOptions { - // Prefer nested Options; fallback to top-level legacy fields - if (len(fr.Options.Select)+len(fr.Options.OrderBy)+len(fr.Options.GroupBy)) > 0 || - fr.Options.Limit != nil || fr.Options.Offset != nil || len(fr.Options.Joins) > 0 { - return fr.Options - } - return findOptions{ - Select: fr.Select, - OrderBy: fr.OrderBy, - GroupBy: fr.GroupBy, - Limit: fr.Limit, - Offset: fr.Offset, - Joins: fr.Joins, - } -} - -func makeFindOptions(o findOptions) []FindOption { - opts := make([]FindOption, 0, 6) - if len(o.OrderBy) > 0 { - opts = append(opts, WithOrderBy(o.OrderBy...)) - } - if len(o.GroupBy) > 0 { - opts = append(opts, WithGroupBy(o.GroupBy...)) - } - if o.Limit != nil { - opts = append(opts, WithLimit(*o.Limit)) - } - if o.Offset != nil { - opts = append(opts, WithOffset(*o.Offset)) - } - if len(o.Select) > 0 { - opts = append(opts, WithSelect(o.Select...)) - } - for _, j := range o.Joins { - opts = append(opts, WithJoin(justOrDefault(strings.ToUpper(j.Kind), "JOIN"), j.Table, j.On)) - } - return opts -} - -func justOrDefault(s, def string) string { - if strings.TrimSpace(s) == "" { - return def - } - return s -} diff --git a/pkg/rqlite/instance.go b/pkg/rqlite/instance.go new file mode 100644 index 0000000..e043d3e --- /dev/null +++ b/pkg/rqlite/instance.go @@ -0,0 +1,240 @@ +package rqlite + +import ( + "context" + "errors" + "fmt" + "net/http" + "os" + "os/exec" + "path/filepath" + "syscall" + "time" + + "github.com/rqlite/gorqlite" + "go.uber.org/zap" +) + +// RQLiteInstance represents a single rqlite database instance +type RQLiteInstance struct { + DatabaseName string + HTTPPort int + RaftPort int + DataDir string + AdvHTTPAddr string // Advertised HTTP address + AdvRaftAddr string // Advertised Raft address + Cmd *exec.Cmd + Connection *gorqlite.Connection + LastQuery time.Time + Status DatabaseStatus + logger *zap.Logger +} + +// NewRQLiteInstance creates a new RQLite instance configuration +func NewRQLiteInstance(dbName string, ports PortPair, dataDir string, advHTTPAddr, advRaftAddr string, logger *zap.Logger) *RQLiteInstance { + return &RQLiteInstance{ + DatabaseName: dbName, + HTTPPort: ports.HTTPPort, + RaftPort: 
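+		// Note: the fields below pin each instance to its own per-database
+		// data directory (<dataDir>/<dbName>/rqlite), so instances co-hosted
+		// on one node never share on-disk state.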
ports.RaftPort, + DataDir: filepath.Join(dataDir, dbName, "rqlite"), + AdvHTTPAddr: advHTTPAddr, + AdvRaftAddr: advRaftAddr, + Status: StatusInitializing, + logger: logger, + } +} + +// Start starts the rqlite subprocess +func (ri *RQLiteInstance) Start(ctx context.Context, isLeader bool, joinAddr string) error { + // Create data directory + if err := os.MkdirAll(ri.DataDir, 0755); err != nil { + return fmt.Errorf("failed to create data directory: %w", err) + } + + // Build rqlited command + args := []string{ + "-http-addr", fmt.Sprintf("0.0.0.0:%d", ri.HTTPPort), + "-raft-addr", fmt.Sprintf("0.0.0.0:%d", ri.RaftPort), + } + + // Add advertised addresses if provided + if ri.AdvHTTPAddr != "" { + args = append(args, "-http-adv-addr", ri.AdvHTTPAddr) + } + if ri.AdvRaftAddr != "" { + args = append(args, "-raft-adv-addr", ri.AdvRaftAddr) + } + + // Add join address if this is a follower + if !isLeader && joinAddr != "" { + args = append(args, "-join", joinAddr) + } + + // Add data directory as positional argument + args = append(args, ri.DataDir) + + ri.logger.Info("Starting RQLite instance", + zap.String("database", ri.DatabaseName), + zap.Int("http_port", ri.HTTPPort), + zap.Int("raft_port", ri.RaftPort), + zap.String("data_dir", ri.DataDir), + zap.Bool("is_leader", isLeader), + zap.Strings("args", args)) + + // Start RQLite process + ri.Cmd = exec.Command("rqlited", args...) + + // Optionally capture stdout/stderr for debugging + // ri.Cmd.Stdout = os.Stdout + // ri.Cmd.Stderr = os.Stderr + + if err := ri.Cmd.Start(); err != nil { + return fmt.Errorf("failed to start rqlited: %w", err) + } + + // Wait for RQLite to be ready + if err := ri.waitForReady(ctx); err != nil { + ri.Stop() + return fmt.Errorf("rqlited failed to become ready: %w", err) + } + + // Create connection + conn, err := gorqlite.Open(fmt.Sprintf("http://localhost:%d", ri.HTTPPort)) + if err != nil { + ri.Stop() + return fmt.Errorf("failed to connect to rqlited: %w", err) + } + ri.Connection = conn + + // Wait for SQL availability + if err := ri.waitForSQLAvailable(ctx); err != nil { + ri.Stop() + return fmt.Errorf("rqlited SQL not available: %w", err) + } + + ri.Status = StatusActive + ri.LastQuery = time.Now() + + ri.logger.Info("RQLite instance started successfully", + zap.String("database", ri.DatabaseName)) + + return nil +} + +// Stop stops the rqlite subprocess gracefully +func (ri *RQLiteInstance) Stop() error { + if ri.Connection != nil { + ri.Connection.Close() + ri.Connection = nil + } + + if ri.Cmd == nil || ri.Cmd.Process == nil { + return nil + } + + ri.logger.Info("Stopping RQLite instance", + zap.String("database", ri.DatabaseName)) + + // Try SIGTERM first + if err := ri.Cmd.Process.Signal(syscall.SIGTERM); err != nil { + // Fallback to Kill if signaling fails + _ = ri.Cmd.Process.Kill() + return nil + } + + // Wait up to 5 seconds for graceful shutdown + done := make(chan error, 1) + go func() { done <- ri.Cmd.Wait() }() + + select { + case err := <-done: + if err != nil && !errors.Is(err, os.ErrClosed) { + ri.logger.Warn("RQLite process exited with error", + zap.String("database", ri.DatabaseName), + zap.Error(err)) + } + case <-time.After(5 * time.Second): + ri.logger.Warn("RQLite did not exit in time; killing", + zap.String("database", ri.DatabaseName)) + _ = ri.Cmd.Process.Kill() + } + + ri.Status = StatusHibernating + return nil +} + +// waitForReady waits for RQLite HTTP endpoint to be ready +func (ri *RQLiteInstance) waitForReady(ctx context.Context) error { + url := 
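+	// /status is rqlite's standard HTTP status endpoint; the loop below polls
+	// it up to 30 times at 1-second intervals before giving up.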
fmt.Sprintf("http://localhost:%d/status", ri.HTTPPort) + client := &http.Client{Timeout: 2 * time.Second} + + for i := 0; i < 30; i++ { + select { + case <-ctx.Done(): + return ctx.Err() + default: + } + + resp, err := client.Get(url) + if err == nil { + resp.Body.Close() + if resp.StatusCode == http.StatusOK { + return nil + } + } + + time.Sleep(1 * time.Second) + } + + return fmt.Errorf("rqlited did not become ready within timeout") +} + +// waitForSQLAvailable waits until SQL queries can be executed +func (ri *RQLiteInstance) waitForSQLAvailable(ctx context.Context) error { + if ri.Connection == nil { + return errors.New("no rqlite connection") + } + + ticker := time.NewTicker(1 * time.Second) + defer ticker.Stop() + + for i := 0; i < 30; i++ { + select { + case <-ctx.Done(): + return ctx.Err() + case <-ticker.C: + _, err := ri.Connection.QueryOne("SELECT 1") + if err == nil { + return nil + } + if i%5 == 0 { + ri.logger.Debug("Waiting for RQLite SQL availability", + zap.String("database", ri.DatabaseName), + zap.Error(err)) + } + } + } + + return fmt.Errorf("rqlited SQL not available within timeout") +} + +// UpdateLastQuery updates the last query timestamp +func (ri *RQLiteInstance) UpdateLastQuery() { + ri.LastQuery = time.Now() +} + +// IsIdle checks if the instance has been idle for the given duration +func (ri *RQLiteInstance) IsIdle(timeout time.Duration) bool { + return time.Since(ri.LastQuery) > timeout +} + +// IsRunning checks if the rqlite process is running +func (ri *RQLiteInstance) IsRunning() bool { + if ri.Cmd == nil || ri.Cmd.Process == nil { + return false + } + + // Check if process is still alive + err := ri.Cmd.Process.Signal(syscall.Signal(0)) + return err == nil +} diff --git a/pkg/rqlite/metadata.go b/pkg/rqlite/metadata.go new file mode 100644 index 0000000..ae03fe8 --- /dev/null +++ b/pkg/rqlite/metadata.go @@ -0,0 +1,153 @@ +package rqlite + +import ( + "sync" + "time" +) + +// DatabaseStatus represents the state of a database cluster +type DatabaseStatus string + +const ( + StatusInitializing DatabaseStatus = "initializing" + StatusActive DatabaseStatus = "active" + StatusHibernating DatabaseStatus = "hibernating" + StatusWaking DatabaseStatus = "waking" +) + +// PortPair represents HTTP and Raft ports for a database instance +type PortPair struct { + HTTPPort int `json:"http_port"` + RaftPort int `json:"raft_port"` +} + +// DatabaseMetadata contains metadata for a single database cluster +type DatabaseMetadata struct { + DatabaseName string `json:"database_name"` // e.g., "my_app_exampledb_1" + NodeIDs []string `json:"node_ids"` // Peer IDs hosting this database + PortMappings map[string]PortPair `json:"port_mappings"` // nodeID -> {HTTP port, Raft port} + Status DatabaseStatus `json:"status"` // Current status + CreatedAt time.Time `json:"created_at"` + LastAccessed time.Time `json:"last_accessed"` + LeaderNodeID string `json:"leader_node_id"` // Which node is rqlite leader + Version uint64 `json:"version"` // For conflict resolution + VectorClock map[string]uint64 `json:"vector_clock"` // For distributed consensus +} + +// NodeCapacity represents capacity information for a node +type NodeCapacity struct { + NodeID string `json:"node_id"` + MaxDatabases int `json:"max_databases"` // Configured limit + CurrentDatabases int `json:"current_databases"` // How many currently active + PortRangeHTTP PortRange `json:"port_range_http"` + PortRangeRaft PortRange `json:"port_range_raft"` + LastHealthCheck time.Time `json:"last_health_check"` + IsHealthy bool 
`json:"is_healthy"` +} + +// PortRange represents a range of available ports +type PortRange struct { + Start int `json:"start"` + End int `json:"end"` +} + +// MetadataStore is an in-memory store for database metadata +type MetadataStore struct { + databases map[string]*DatabaseMetadata // key = database name + nodes map[string]*NodeCapacity // key = node ID + mu sync.RWMutex +} + +// NewMetadataStore creates a new metadata store +func NewMetadataStore() *MetadataStore { + return &MetadataStore{ + databases: make(map[string]*DatabaseMetadata), + nodes: make(map[string]*NodeCapacity), + } +} + +// GetDatabase retrieves metadata for a database +func (ms *MetadataStore) GetDatabase(name string) *DatabaseMetadata { + ms.mu.RLock() + defer ms.mu.RUnlock() + if db, exists := ms.databases[name]; exists { + // Return a copy to prevent external modification + dbCopy := *db + return &dbCopy + } + return nil +} + +// SetDatabase stores or updates metadata for a database +func (ms *MetadataStore) SetDatabase(db *DatabaseMetadata) { + ms.mu.Lock() + defer ms.mu.Unlock() + ms.databases[db.DatabaseName] = db +} + +// DeleteDatabase removes metadata for a database +func (ms *MetadataStore) DeleteDatabase(name string) { + ms.mu.Lock() + defer ms.mu.Unlock() + delete(ms.databases, name) +} + +// ListDatabases returns all database names +func (ms *MetadataStore) ListDatabases() []string { + ms.mu.RLock() + defer ms.mu.RUnlock() + names := make([]string, 0, len(ms.databases)) + for name := range ms.databases { + names = append(names, name) + } + return names +} + +// GetNode retrieves capacity info for a node +func (ms *MetadataStore) GetNode(nodeID string) *NodeCapacity { + ms.mu.RLock() + defer ms.mu.RUnlock() + if node, exists := ms.nodes[nodeID]; exists { + nodeCopy := *node + return &nodeCopy + } + return nil +} + +// SetNode stores or updates capacity info for a node +func (ms *MetadataStore) SetNode(node *NodeCapacity) { + ms.mu.Lock() + defer ms.mu.Unlock() + ms.nodes[node.NodeID] = node +} + +// DeleteNode removes capacity info for a node +func (ms *MetadataStore) DeleteNode(nodeID string) { + ms.mu.Lock() + defer ms.mu.Unlock() + delete(ms.nodes, nodeID) +} + +// ListNodes returns all node IDs +func (ms *MetadataStore) ListNodes() []string { + ms.mu.RLock() + defer ms.mu.RUnlock() + ids := make([]string, 0, len(ms.nodes)) + for id := range ms.nodes { + ids = append(ids, id) + } + return ids +} + +// GetHealthyNodes returns IDs of healthy nodes +func (ms *MetadataStore) GetHealthyNodes() []string { + ms.mu.RLock() + defer ms.mu.RUnlock() + healthy := make([]string, 0) + for id, node := range ms.nodes { + if node.IsHealthy && node.CurrentDatabases < node.MaxDatabases { + healthy = append(healthy, id) + } + } + return healthy +} diff --git a/pkg/rqlite/migrations.go b/pkg/rqlite/migrations.go deleted file mode 100644 index 60efc9b..0000000 --- a/pkg/rqlite/migrations.go +++ /dev/null @@ -1,442 +0,0 @@ -package rqlite - -import ( - "context" - "database/sql" - "fmt" - "io/fs" - "os" - "path/filepath" - "sort" - "strconv" - "strings" - "unicode" - - _ "github.com/rqlite/gorqlite/stdlib" - "go.uber.org/zap" -) - -// ApplyMigrations scans a directory for *.sql files, orders them by numeric prefix, -// and applies any that are not yet recorded in schema_migrations(version). 
-func ApplyMigrations(ctx context.Context, db *sql.DB, dir string, logger *zap.Logger) error { - if logger == nil { - logger = zap.NewNop() - } - - if err := ensureMigrationsTable(ctx, db); err != nil { - return fmt.Errorf("ensure schema_migrations: %w", err) - } - - files, err := readMigrationFiles(dir) - if err != nil { - return fmt.Errorf("read migration files: %w", err) - } - if len(files) == 0 { - logger.Info("No migrations found", zap.String("dir", dir)) - return nil - } - - applied, err := loadAppliedVersions(ctx, db) - if err != nil { - return fmt.Errorf("load applied versions: %w", err) - } - - for _, mf := range files { - if applied[mf.Version] { - logger.Info("Migration already applied; skipping", zap.Int("version", mf.Version), zap.String("name", mf.Name)) - continue - } - - sqlBytes, err := os.ReadFile(mf.Path) - if err != nil { - return fmt.Errorf("read migration %s: %w", mf.Path, err) - } - - logger.Info("Applying migration", zap.Int("version", mf.Version), zap.String("name", mf.Name)) - if err := applySQL(ctx, db, string(sqlBytes)); err != nil { - return fmt.Errorf("apply migration %d (%s): %w", mf.Version, mf.Name, err) - } - - if _, err := db.ExecContext(ctx, `INSERT OR IGNORE INTO schema_migrations(version) VALUES (?)`, mf.Version); err != nil { - return fmt.Errorf("record migration %d: %w", mf.Version, err) - } - logger.Info("Migration applied", zap.Int("version", mf.Version), zap.String("name", mf.Name)) - } - - return nil -} - -// ApplyMigrationsDirs applies migrations from multiple directories. -// - Gathers *.sql files from each dir -// - Parses numeric prefix as the version -// - Errors if the same version appears in more than one dir (to avoid ambiguity) -// - Sorts globally by version and applies those not yet in schema_migrations -func ApplyMigrationsDirs(ctx context.Context, db *sql.DB, dirs []string, logger *zap.Logger) error { - if logger == nil { - logger = zap.NewNop() - } - if err := ensureMigrationsTable(ctx, db); err != nil { - return fmt.Errorf("ensure schema_migrations: %w", err) - } - - files, err := readMigrationFilesFromDirs(dirs) - if err != nil { - return err - } - if len(files) == 0 { - logger.Info("No migrations found in provided directories", zap.Strings("dirs", dirs)) - return nil - } - - applied, err := loadAppliedVersions(ctx, db) - if err != nil { - return fmt.Errorf("load applied versions: %w", err) - } - - for _, mf := range files { - if applied[mf.Version] { - logger.Info("Migration already applied; skipping", zap.Int("version", mf.Version), zap.String("name", mf.Name), zap.String("path", mf.Path)) - continue - } - sqlBytes, err := os.ReadFile(mf.Path) - if err != nil { - return fmt.Errorf("read migration %s: %w", mf.Path, err) - } - - logger.Info("Applying migration", zap.Int("version", mf.Version), zap.String("name", mf.Name), zap.String("path", mf.Path)) - if err := applySQL(ctx, db, string(sqlBytes)); err != nil { - return fmt.Errorf("apply migration %d (%s): %w", mf.Version, mf.Name, err) - } - - if _, err := db.ExecContext(ctx, `INSERT OR IGNORE INTO schema_migrations(version) VALUES (?)`, mf.Version); err != nil { - return fmt.Errorf("record migration %d: %w", mf.Version, err) - } - logger.Info("Migration applied", zap.Int("version", mf.Version), zap.String("name", mf.Name)) - } - - return nil -} - -// ApplyMigrationsFromManager is a convenience helper bound to RQLiteManager. 
-func (r *RQLiteManager) ApplyMigrations(ctx context.Context, dir string) error { - db, err := sql.Open("rqlite", fmt.Sprintf("http://localhost:%d", r.config.RQLitePort)) - if err != nil { - return fmt.Errorf("open rqlite db: %w", err) - } - defer db.Close() - - return ApplyMigrations(ctx, db, dir, r.logger) -} - -// ApplyMigrationsDirs is the multi-dir variant on RQLiteManager. -func (r *RQLiteManager) ApplyMigrationsDirs(ctx context.Context, dirs []string) error { - db, err := sql.Open("rqlite", fmt.Sprintf("http://localhost:%d", r.config.RQLitePort)) - if err != nil { - return fmt.Errorf("open rqlite db: %w", err) - } - defer db.Close() - - return ApplyMigrationsDirs(ctx, db, dirs, r.logger) -} - -func ensureMigrationsTable(ctx context.Context, db *sql.DB) error { - _, err := db.ExecContext(ctx, ` -CREATE TABLE IF NOT EXISTS schema_migrations ( - version INTEGER PRIMARY KEY, - applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP -)`) - return err -} - -type migrationFile struct { - Version int - Name string - Path string -} - -func readMigrationFiles(dir string) ([]migrationFile, error) { - entries, err := os.ReadDir(dir) - if err != nil { - if os.IsNotExist(err) { - return []migrationFile{}, nil - } - return nil, err - } - - var out []migrationFile - for _, e := range entries { - if e.IsDir() { - continue - } - name := e.Name() - if !strings.HasSuffix(strings.ToLower(name), ".sql") { - continue - } - ver, ok := parseVersionPrefix(name) - if !ok { - continue - } - out = append(out, migrationFile{ - Version: ver, - Name: name, - Path: filepath.Join(dir, name), - }) - } - sort.Slice(out, func(i, j int) bool { return out[i].Version < out[j].Version }) - return out, nil -} - -func readMigrationFilesFromDirs(dirs []string) ([]migrationFile, error) { - all := make([]migrationFile, 0, 64) - seen := map[int]string{} // version -> path (for duplicate detection) - - for _, d := range dirs { - files, err := readMigrationFiles(d) - if err != nil { - return nil, fmt.Errorf("reading dir %s: %w", d, err) - } - for _, f := range files { - if prev, dup := seen[f.Version]; dup { - return nil, fmt.Errorf("duplicate migration version %d detected in %s and %s; ensure global version uniqueness", f.Version, prev, f.Path) - } - seen[f.Version] = f.Path - all = append(all, f) - } - } - sort.Slice(all, func(i, j int) bool { return all[i].Version < all[j].Version }) - return all, nil -} - -func parseVersionPrefix(name string) (int, bool) { - // Expect formats like "001_initial.sql", "2_add_table.sql", etc. - i := 0 - for i < len(name) && unicode.IsDigit(rune(name[i])) { - i++ - } - if i == 0 { - return 0, false - } - ver, err := strconv.Atoi(name[:i]) - if err != nil { - return 0, false - } - return ver, true -} - -func loadAppliedVersions(ctx context.Context, db *sql.DB) (map[int]bool, error) { - rows, err := db.QueryContext(ctx, `SELECT version FROM schema_migrations`) - if err != nil { - // If the table doesn't exist yet (very first run), ensure it and return empty set. 
- if isNoSuchTable(err) { - if err := ensureMigrationsTable(ctx, db); err != nil { - return nil, err - } - return map[int]bool{}, nil - } - return nil, err - } - defer rows.Close() - - applied := make(map[int]bool) - for rows.Next() { - var v int - if err := rows.Scan(&v); err != nil { - return nil, err - } - applied[v] = true - } - return applied, rows.Err() -} - -func isNoSuchTable(err error) bool { - // rqlite/sqlite error messages vary; keep it permissive - msg := strings.ToLower(err.Error()) - return strings.Contains(msg, "no such table") || strings.Contains(msg, "does not exist") -} - -// applySQL splits the script into individual statements, strips explicit -// transaction control (BEGIN/COMMIT/ROLLBACK/END), and executes statements -// sequentially to avoid nested transaction issues with rqlite. -func applySQL(ctx context.Context, db *sql.DB, script string) error { - s := strings.TrimSpace(script) - if s == "" { - return nil - } - stmts := splitSQLStatements(s) - stmts = filterOutTxnControls(stmts) - - for _, stmt := range stmts { - if strings.TrimSpace(stmt) == "" { - continue - } - if _, err := db.ExecContext(ctx, stmt); err != nil { - return fmt.Errorf("exec stmt failed: %w (stmt: %s)", err, snippet(stmt)) - } - } - return nil -} - -func containsToken(stmts []string, token string) bool { - for _, s := range stmts { - if strings.EqualFold(strings.TrimSpace(s), token) { - return true - } - } - return false -} - -// removed duplicate helper - -// removed duplicate helper - -// isTxnControl returns true if the statement is a transaction control command. -func isTxnControl(s string) bool { - t := strings.ToUpper(strings.TrimSpace(s)) - switch t { - case "BEGIN", "BEGIN TRANSACTION", "COMMIT", "END", "ROLLBACK": - return true - default: - return false - } -} - -// filterOutTxnControls removes BEGIN/COMMIT/ROLLBACK/END statements. -func filterOutTxnControls(stmts []string) []string { - out := make([]string, 0, len(stmts)) - for _, s := range stmts { - if isTxnControl(s) { - continue - } - out = append(out, s) - } - return out -} - -func snippet(s string) string { - s = strings.TrimSpace(s) - if len(s) > 120 { - return s[:120] + "..." - } - return s -} - -// splitSQLStatements splits a SQL script into statements by semicolon, ignoring semicolons -// inside single/double-quoted strings and skipping comments (-- and /* */). -func splitSQLStatements(in string) []string { - var out []string - var b strings.Builder - - inLineComment := false - inBlockComment := false - inSingle := false - inDouble := false - - runes := []rune(in) - for i := 0; i < len(runes); i++ { - ch := runes[i] - next := rune(0) - if i+1 < len(runes) { - next = runes[i+1] - } - - // Handle end of line comment - if inLineComment { - if ch == '\n' { - inLineComment = false - // keep newline normalization but don't include comment - } - continue - } - // Handle end of block comment - if inBlockComment { - if ch == '*' && next == '/' { - inBlockComment = false - i++ - } - continue - } - - // Start of comments? - if !inSingle && !inDouble { - if ch == '-' && next == '-' { - inLineComment = true - i++ - continue - } - if ch == '/' && next == '*' { - inBlockComment = true - i++ - continue - } - } - - // Quotes - if !inDouble && ch == '\'' { - // Toggle single quotes, respecting escaped '' inside. 
- if inSingle { - // Check for escaped '' (two single quotes) - if next == '\'' { - b.WriteRune(ch) // write one ' - i++ // skip the next ' - continue - } - inSingle = false - } else { - inSingle = true - } - b.WriteRune(ch) - continue - } - if !inSingle && ch == '"' { - if inDouble { - if next == '"' { - b.WriteRune(ch) - i++ - continue - } - inDouble = false - } else { - inDouble = true - } - b.WriteRune(ch) - continue - } - - // Statement boundary - if ch == ';' && !inSingle && !inDouble { - stmt := strings.TrimSpace(b.String()) - if stmt != "" { - out = append(out, stmt) - } - b.Reset() - continue - } - - b.WriteRune(ch) - } - - // Final fragment - if s := strings.TrimSpace(b.String()); s != "" { - out = append(out, s) - } - return out -} - -// Optional helper to load embedded migrations if you later decide to embed. -// Keep for future use; currently unused. -func readDirFS(fsys fs.FS, root string) ([]string, error) { - var files []string - err := fs.WalkDir(fsys, root, func(path string, d fs.DirEntry, err error) error { - if err != nil { - return err - } - if d.IsDir() { - return nil - } - if strings.HasSuffix(strings.ToLower(d.Name()), ".sql") { - files = append(files, path) - } - return nil - }) - return files, err -} diff --git a/pkg/rqlite/ports.go b/pkg/rqlite/ports.go new file mode 100644 index 0000000..3ff994e --- /dev/null +++ b/pkg/rqlite/ports.go @@ -0,0 +1,208 @@ +package rqlite + +import ( + "errors" + "fmt" + "math/rand" + "net" + "sync" +) + +// PortManager manages port allocation for database instances +type PortManager struct { + allocatedPorts map[int]string // port -> database name + httpRange PortRange + raftRange PortRange + mu sync.RWMutex +} + +// NewPortManager creates a new port manager +func NewPortManager(httpRange, raftRange PortRange) *PortManager { + return &PortManager{ + allocatedPorts: make(map[int]string), + httpRange: httpRange, + raftRange: raftRange, + } +} + +// AllocatePortPair allocates a pair of ports (HTTP and Raft) for a database +func (pm *PortManager) AllocatePortPair(dbName string) (PortPair, error) { + pm.mu.Lock() + defer pm.mu.Unlock() + + // Try up to 20 times to find available ports + for attempt := 0; attempt < 20; attempt++ { + httpPort := pm.randomPortInRange(pm.httpRange) + raftPort := pm.randomPortInRange(pm.raftRange) + + // Check if already allocated + if _, exists := pm.allocatedPorts[httpPort]; exists { + continue + } + if _, exists := pm.allocatedPorts[raftPort]; exists { + continue + } + + // Test if actually bindable + if !pm.canBind(httpPort) || !pm.canBind(raftPort) { + continue + } + + // Allocate the ports + pm.allocatedPorts[httpPort] = dbName + pm.allocatedPorts[raftPort] = dbName + + return PortPair{HTTPPort: httpPort, RaftPort: raftPort}, nil + } + + return PortPair{}, errors.New("no available ports after 20 attempts") +} + +// ReleasePortPair releases a pair of ports back to the pool +func (pm *PortManager) ReleasePortPair(pair PortPair) { + pm.mu.Lock() + defer pm.mu.Unlock() + + delete(pm.allocatedPorts, pair.HTTPPort) + delete(pm.allocatedPorts, pair.RaftPort) +} + +// IsPortPairAvailable checks if a specific port pair is available +func (pm *PortManager) IsPortPairAvailable(pair PortPair) bool { + pm.mu.RLock() + defer pm.mu.RUnlock() + + // Check if ports are in range + if !pm.isInRange(pair.HTTPPort, pm.httpRange) { + return false + } + if !pm.isInRange(pair.RaftPort, pm.raftRange) { + return false + } + + // Check if already allocated + if _, exists := pm.allocatedPorts[pair.HTTPPort]; exists { + return 
false
+	}
+	if _, exists := pm.allocatedPorts[pair.RaftPort]; exists {
+		return false
+	}
+
+	// Test if actually bindable
+	return pm.canBind(pair.HTTPPort) && pm.canBind(pair.RaftPort)
+}
+
+// AllocateSpecificPortPair attempts to allocate a specific port pair
+func (pm *PortManager) AllocateSpecificPortPair(dbName string, pair PortPair) error {
+	pm.mu.Lock()
+	defer pm.mu.Unlock()
+
+	// Check if ports are in range
+	if !pm.isInRange(pair.HTTPPort, pm.httpRange) {
+		return fmt.Errorf("HTTP port %d not in range %d-%d", pair.HTTPPort, pm.httpRange.Start, pm.httpRange.End)
+	}
+	if !pm.isInRange(pair.RaftPort, pm.raftRange) {
+		return fmt.Errorf("Raft port %d not in range %d-%d", pair.RaftPort, pm.raftRange.Start, pm.raftRange.End)
+	}
+
+	// Check if already allocated
+	if _, exists := pm.allocatedPorts[pair.HTTPPort]; exists {
+		return fmt.Errorf("HTTP port %d already allocated", pair.HTTPPort)
+	}
+	if _, exists := pm.allocatedPorts[pair.RaftPort]; exists {
+		return fmt.Errorf("Raft port %d already allocated", pair.RaftPort)
+	}
+
+	// Test if actually bindable
+	if !pm.canBind(pair.HTTPPort) {
+		return fmt.Errorf("HTTP port %d not bindable", pair.HTTPPort)
+	}
+	if !pm.canBind(pair.RaftPort) {
+		return fmt.Errorf("Raft port %d not bindable", pair.RaftPort)
+	}
+
+	// Allocate the ports
+	pm.allocatedPorts[pair.HTTPPort] = dbName
+	pm.allocatedPorts[pair.RaftPort] = dbName
+
+	return nil
+}
+
+// GetAllocatedPorts returns a copy of all currently allocated ports
+func (pm *PortManager) GetAllocatedPorts() map[int]string {
+	pm.mu.RLock()
+	defer pm.mu.RUnlock()
+
+	// Return a copy so callers cannot mutate internal state
+	out := make(map[int]string, len(pm.allocatedPorts))
+	for port, db := range pm.allocatedPorts {
+		out[port] = db
+	}
+	return out
+}
+
+// GetAvailablePortCount returns the approximate number of available port pairs
+func (pm *PortManager) GetAvailablePortCount() int {
+	pm.mu.RLock()
+	defer pm.mu.RUnlock()
+
+	httpCount := pm.httpRange.End - pm.httpRange.Start + 1
+	raftCount := pm.raftRange.End - pm.raftRange.Start + 1
+
+	// Take the minimum of the two ranges (since we allocate in pairs)
+	totalPairs := httpCount
+	if raftCount < httpCount {
+		totalPairs = raftCount
+	}
+
+	return totalPairs - len(pm.allocatedPorts)/2
+}
+
+// IsPortAllocated checks if a port is currently allocated
+func (pm *PortManager) IsPortAllocated(port int) bool {
+	pm.mu.RLock()
+	defer pm.mu.RUnlock()
+	_, exists := pm.allocatedPorts[port]
+	return exists
+}
+
+// AllocateSpecificPorts allocates specific ports for a database. Unlike
+// AllocateSpecificPortPair, it performs no range or bind checks; it only
+// verifies the ports are not already booked locally.
+func (pm *PortManager) AllocateSpecificPorts(dbName string, ports PortPair) error {
+	pm.mu.Lock()
+	defer pm.mu.Unlock()
+
+	// Check if ports are already allocated
+	if _, exists := pm.allocatedPorts[ports.HTTPPort]; exists {
+		return fmt.Errorf("HTTP port %d already allocated", ports.HTTPPort)
+	}
+	if _, exists := pm.allocatedPorts[ports.RaftPort]; exists {
+		return fmt.Errorf("Raft port %d already allocated", ports.RaftPort)
+	}
+
+	// Allocate the ports
+	pm.allocatedPorts[ports.HTTPPort] = dbName
+	pm.allocatedPorts[ports.RaftPort] = dbName
+
+	return nil
+}
+
+// randomPortInRange returns a random port within the given range
+func (pm *PortManager) randomPortInRange(portRange PortRange) int {
+	return portRange.Start + rand.Intn(portRange.End-portRange.Start+1)
+}
+
+// isInRange checks if a port is within the given range
+func (pm *PortManager) isInRange(port int, portRange PortRange) bool {
+	return port >= portRange.Start && port <= portRange.End
+}
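+
+// A minimal usage sketch (illustrative only; the ranges here are hypothetical,
+// real values come from the node's configured port ranges):
+//
+//	pm := NewPortManager(
+//		PortRange{Start: 6001, End: 6099}, // HTTP range
+//		PortRange{Start: 8001, End: 8099}, // Raft range
+//	)
+//	pair, err := pm.AllocatePortPair("my_app_exampledb_1")
+//	if err != nil {
+//		// no bindable pair found after 20 attempts
+//	}
+//	defer pm.ReleasePortPair(pair)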
+
+// canBind tests if a port can be bound
+func (pm *PortManager) canBind(port int) bool {
+	// Test bind to check if port is actually available
+	listener, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
+	if err != nil {
+		return false
+	}
+	listener.Close()
+	return true
+}
diff --git a/pkg/rqlite/pubsub_messages.go b/pkg/rqlite/pubsub_messages.go
new file mode 100644
index 0000000..a052498
--- /dev/null
+++ b/pkg/rqlite/pubsub_messages.go
@@ -0,0 +1,204 @@
+package rqlite
+
+import (
+	"encoding/json"
+	"time"
+)
+
+// MessageType represents the type of metadata message
+type MessageType string
+
+const (
+	// Database lifecycle
+	MsgDatabaseCreateRequest  MessageType = "DATABASE_CREATE_REQUEST"
+	MsgDatabaseCreateResponse MessageType = "DATABASE_CREATE_RESPONSE"
+	MsgDatabaseCreateConfirm  MessageType = "DATABASE_CREATE_CONFIRM"
+	MsgDatabaseStatusUpdate   MessageType = "DATABASE_STATUS_UPDATE"
+	MsgDatabaseDelete         MessageType = "DATABASE_DELETE"
+
+	// Hibernation
+	MsgDatabaseIdleNotification    MessageType = "DATABASE_IDLE_NOTIFICATION"
+	MsgDatabaseShutdownCoordinated MessageType = "DATABASE_SHUTDOWN_COORDINATED"
+	MsgDatabaseWakeupRequest       MessageType = "DATABASE_WAKEUP_REQUEST"
+
+	// Node management
+	MsgNodeCapacityAnnouncement MessageType = "NODE_CAPACITY_ANNOUNCEMENT"
+	MsgNodeHealthPing           MessageType = "NODE_HEALTH_PING"
+	MsgNodeHealthPong           MessageType = "NODE_HEALTH_PONG"
+
+	// Failure handling
+	MsgNodeReplacementNeeded  MessageType = "NODE_REPLACEMENT_NEEDED"
+	MsgNodeReplacementOffer   MessageType = "NODE_REPLACEMENT_OFFER"
+	MsgNodeReplacementConfirm MessageType = "NODE_REPLACEMENT_CONFIRM"
+	MsgDatabaseCleanup        MessageType = "DATABASE_CLEANUP"
+
+	// Gossip
+	MsgMetadataSync        MessageType = "METADATA_SYNC"
+	MsgMetadataChecksumReq MessageType = "METADATA_CHECKSUM_REQUEST"
+	MsgMetadataChecksumRes MessageType = "METADATA_CHECKSUM_RESPONSE"
+)
+
+// MetadataMessage is the envelope for all metadata messages
+type MetadataMessage struct {
+	Type      MessageType     `json:"type"`
+	Timestamp time.Time       `json:"timestamp"`
+	NodeID    string          `json:"node_id"` // Sender
+	Payload   json.RawMessage `json:"payload"`
+}
+
+// DatabaseCreateRequest is sent when a client wants to create a new database
+type DatabaseCreateRequest struct {
+	DatabaseName      string `json:"database_name"`
+	RequesterNodeID   string `json:"requester_node_id"`
+	ReplicationFactor int    `json:"replication_factor"`
+}
+
+// DatabaseCreateResponse is sent by eligible nodes offering to host the database
+type DatabaseCreateResponse struct {
+	DatabaseName   string   `json:"database_name"`
+	NodeID         string   `json:"node_id"`
+	AvailablePorts PortPair `json:"available_ports"`
+}
+
+// DatabaseCreateConfirm is sent by the coordinator with the final membership
+type DatabaseCreateConfirm struct {
+	DatabaseName      string           `json:"database_name"`
+	SelectedNodes     []NodeAssignment `json:"selected_nodes"`
+	CoordinatorNodeID string           `json:"coordinator_node_id"`
+}
+
+// NodeAssignment represents a node assignment in a database cluster
+type NodeAssignment struct {
+	NodeID   string `json:"node_id"`
+	HTTPPort int    `json:"http_port"`
+	RaftPort int    `json:"raft_port"`
+	Role     string `json:"role"` // "leader" or "follower"
+}
+
+// DatabaseStatusUpdate is sent when a database changes status
+type DatabaseStatusUpdate struct {
+	DatabaseName string         `json:"database_name"`
+	NodeID       string         `json:"node_id"`
+	Status       DatabaseStatus `json:"status"`
+	HTTPPort     int            `json:"http_port,omitempty"`
+	RaftPort     int            `json:"raft_port,omitempty"`
+}
+
+// DatabaseIdleNotification is sent when a node detects an idle database
+type DatabaseIdleNotification struct {
+	DatabaseName string    `json:"database_name"`
+	NodeID       string    `json:"node_id"`
+	LastActivity time.Time `json:"last_activity"`
+}
+
+// DatabaseShutdownCoordinated is sent to coordinate hibernation shutdown
+type DatabaseShutdownCoordinated struct {
+	DatabaseName string    `json:"database_name"`
+	ShutdownTime time.Time `json:"shutdown_time"` // When to actually shut down
+}
+
+// DatabaseWakeupRequest is sent to wake up a hibernating database
+type DatabaseWakeupRequest struct {
+	DatabaseName    string `json:"database_name"`
+	RequesterNodeID string `json:"requester_node_id"`
+}
+
+// NodeCapacityAnnouncement is sent periodically to announce node capacity
+type NodeCapacityAnnouncement struct {
+	NodeID           string    `json:"node_id"`
+	MaxDatabases     int       `json:"max_databases"`
+	CurrentDatabases int       `json:"current_databases"`
+	PortRangeHTTP    PortRange `json:"port_range_http"`
+	PortRangeRaft    PortRange `json:"port_range_raft"`
+}
+
+// NodeHealthPing is sent periodically for health checks
+type NodeHealthPing struct {
+	NodeID           string `json:"node_id"`
+	CurrentDatabases int    `json:"current_databases"`
+}
+
+// NodeHealthPong is the response to a health ping
+type NodeHealthPong struct {
+	NodeID   string `json:"node_id"`
+	Healthy  bool   `json:"healthy"`
+	PingFrom string `json:"ping_from"`
+}
+
+// NodeReplacementNeeded is sent when a node failure is detected
+type NodeReplacementNeeded struct {
+	DatabaseName      string   `json:"database_name"`
+	FailedNodeID      string   `json:"failed_node_id"`
+	CurrentNodes      []string `json:"current_nodes"`
+	ReplicationFactor int      `json:"replication_factor"`
+}
+
+// NodeReplacementOffer is sent by nodes offering to replace a failed node
+type NodeReplacementOffer struct {
+	DatabaseName   string   `json:"database_name"`
+	NodeID         string   `json:"node_id"`
+	AvailablePorts PortPair `json:"available_ports"`
+}
+
+// NodeReplacementConfirm is sent when a replacement node is selected
+type NodeReplacementConfirm struct {
+	DatabaseName   string   `json:"database_name"`
+	NewNodeID      string   `json:"new_node_id"`
+	ReplacedNodeID string   `json:"replaced_node_id"`
+	NewNodePorts   PortPair `json:"new_node_ports"`
+	JoinAddress    string   `json:"join_address"`
+}
+
+// DatabaseCleanup is sent to trigger cleanup of orphaned data
+type DatabaseCleanup struct {
+	DatabaseName string `json:"database_name"`
+	NodeID       string `json:"node_id"`
+	Action       string `json:"action"` // e.g., "deleted_orphaned_data"
+}
+
+// MetadataSync contains full database metadata for synchronization
+type MetadataSync struct {
+	Metadata *DatabaseMetadata `json:"metadata"`
+}
+
+// MetadataChecksumRequest requests checksums from other nodes
+type MetadataChecksumRequest struct {
+	RequestID string `json:"request_id"`
+}
+
+// MetadataChecksumResponse contains checksums for all databases.
+// Note: the MetadataChecksum element type is not defined in this file; it is
+// expected to live alongside the consensus/gossip helpers.
+type MetadataChecksumResponse struct {
+	RequestID string             `json:"request_id"`
+	Checksums []MetadataChecksum `json:"checksums"`
+}
+
+// MarshalMetadataMessage creates a MetadataMessage with the given payload
+func MarshalMetadataMessage(msgType MessageType, nodeID string, payload interface{}) ([]byte, error) {
+	payloadBytes, err := json.Marshal(payload)
+	if err != nil {
+		return nil, err
+	}
+
+	msg := MetadataMessage{
+		Type:      msgType,
+		Timestamp: time.Now(),
+		NodeID:    nodeID,
+		Payload:   payloadBytes,
+	}
+
+	return json.Marshal(msg)
+}
+
+// UnmarshalMetadataMessage parses a MetadataMessage
+func UnmarshalMetadataMessage(data []byte) (*MetadataMessage, error) {
+	var msg MetadataMessage
+	if err := json.Unmarshal(data, &msg); err != nil {
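+		// A failure here means the envelope itself is malformed; payloads are
+		// decoded in a second step via UnmarshalPayload once Type is known.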
return nil, err + } + return &msg, nil +} + +// UnmarshalPayload unmarshals the payload into the given type +func (msg *MetadataMessage) UnmarshalPayload(v interface{}) error { + return json.Unmarshal(msg.Payload, v) +} diff --git a/pkg/rqlite/rqlite.go b/pkg/rqlite/rqlite.go deleted file mode 100644 index eca678d..0000000 --- a/pkg/rqlite/rqlite.go +++ /dev/null @@ -1,362 +0,0 @@ -package rqlite - -import ( - "context" - "errors" - "fmt" - "net/http" - "os" - "os/exec" - "path/filepath" - "strings" - "syscall" - "time" - - "github.com/rqlite/gorqlite" - "go.uber.org/zap" - - "github.com/DeBrosOfficial/network/pkg/config" -) - -// RQLiteManager manages an RQLite node instance -type RQLiteManager struct { - config *config.DatabaseConfig - discoverConfig *config.DiscoveryConfig - dataDir string - logger *zap.Logger - cmd *exec.Cmd - connection *gorqlite.Connection -} - -// waitForSQLAvailable waits until a simple query succeeds, indicating a leader is known and queries can be served. -func (r *RQLiteManager) waitForSQLAvailable(ctx context.Context) error { - if r.connection == nil { - r.logger.Error("No rqlite connection") - return errors.New("no rqlite connection") - } - - ticker := time.NewTicker(1 * time.Second) - defer ticker.Stop() - - attempts := 0 - for { - select { - case <-ctx.Done(): - return ctx.Err() - case <-ticker.C: - attempts++ - _, err := r.connection.QueryOne("SELECT 1") - if err == nil { - r.logger.Info("RQLite SQL is available") - return nil - } - if attempts%5 == 0 { // log every ~5s to reduce noise - r.logger.Debug("Waiting for RQLite SQL availability", zap.Error(err)) - } - } - } -} - -// NewRQLiteManager creates a new RQLite manager -func NewRQLiteManager(cfg *config.DatabaseConfig, discoveryCfg *config.DiscoveryConfig, dataDir string, logger *zap.Logger) *RQLiteManager { - return &RQLiteManager{ - config: cfg, - discoverConfig: discoveryCfg, - dataDir: dataDir, - logger: logger, - } -} - -// Start starts the RQLite node -func (r *RQLiteManager) Start(ctx context.Context) error { - // Create data directory - rqliteDataDir := filepath.Join(r.dataDir, "rqlite") - if err := os.MkdirAll(rqliteDataDir, 0755); err != nil { - return fmt.Errorf("failed to create RQLite data directory: %w", err) - } - - if r.discoverConfig.HttpAdvAddress == "" { - return fmt.Errorf("discovery config HttpAdvAddress is empty") - } - - // Build RQLite command - args := []string{ - "-http-addr", fmt.Sprintf("0.0.0.0:%d", r.config.RQLitePort), - "-http-adv-addr", r.discoverConfig.HttpAdvAddress, - "-raft-adv-addr", r.discoverConfig.RaftAdvAddress, - "-raft-addr", fmt.Sprintf("0.0.0.0:%d", r.config.RQLiteRaftPort), - } - - // Add join address if specified (for non-bootstrap or secondary bootstrap nodes) - if r.config.RQLiteJoinAddress != "" { - r.logger.Info("Joining RQLite cluster", zap.String("join_address", r.config.RQLiteJoinAddress)) - - // Normalize join address to host:port for rqlited -join - joinArg := r.config.RQLiteJoinAddress - if strings.HasPrefix(joinArg, "http://") { - joinArg = strings.TrimPrefix(joinArg, "http://") - } else if strings.HasPrefix(joinArg, "https://") { - joinArg = strings.TrimPrefix(joinArg, "https://") - } - - // Wait for join target to become reachable to avoid forming a separate cluster (wait indefinitely) - if err := r.waitForJoinTarget(ctx, joinArg, 0); err != nil { - r.logger.Warn("Join target did not become reachable within timeout; will still attempt to join", - zap.String("join_address", r.config.RQLiteJoinAddress), - zap.Error(err)) - } - - // Always add the 
join parameter in host:port form - let rqlited handle the rest - args = append(args, "-join", joinArg) - } else { - r.logger.Info("No join address specified - starting as new cluster") - } - - // Add data directory as positional argument - args = append(args, rqliteDataDir) - - r.logger.Info("Starting RQLite node", - zap.String("data_dir", rqliteDataDir), - zap.Int("http_port", r.config.RQLitePort), - zap.Int("raft_port", r.config.RQLiteRaftPort), - zap.String("join_address", r.config.RQLiteJoinAddress), - zap.Strings("full_args", args), - ) - - // Start RQLite process (not bound to ctx for graceful Stop handling) - r.cmd = exec.Command("rqlited", args...) - - // Uncomment if you want to see the stdout/stderr of the RQLite process - // r.cmd.Stdout = os.Stdout - // r.cmd.Stderr = os.Stderr - - if err := r.cmd.Start(); err != nil { - return fmt.Errorf("failed to start RQLite: %w", err) - } - - // Wait for RQLite to be ready - if err := r.waitForReady(ctx); err != nil { - if r.cmd != nil && r.cmd.Process != nil { - _ = r.cmd.Process.Kill() - } - return fmt.Errorf("RQLite failed to become ready: %w", err) - } - - // Create connection - conn, err := gorqlite.Open(fmt.Sprintf("http://localhost:%d", r.config.RQLitePort)) - if err != nil { - if r.cmd != nil && r.cmd.Process != nil { - _ = r.cmd.Process.Kill() - } - return fmt.Errorf("failed to connect to RQLite: %w", err) - } - r.connection = conn - - // Leadership/SQL readiness gating - // - // Fresh bootstrap (no join, no prior state): wait for leadership so queries will work. - // Existing state or joiners: wait for SQL availability (leader known) before proceeding, - // so higher layers (storage) don't fail with 500 leader-not-found. - if r.config.RQLiteJoinAddress == "" && !r.hasExistingState(rqliteDataDir) { - if err := r.waitForLeadership(ctx); err != nil { - if r.cmd != nil && r.cmd.Process != nil { - _ = r.cmd.Process.Kill() - } - return fmt.Errorf("RQLite failed to establish leadership: %w", err) - } - } else { - r.logger.Info("Waiting for RQLite SQL availability (leader discovery)") - if err := r.waitForSQLAvailable(ctx); err != nil { - if r.cmd != nil && r.cmd.Process != nil { - _ = r.cmd.Process.Kill() - } - return fmt.Errorf("RQLite SQL not available: %w", err) - } - } - - // After waitForLeadership / waitForSQLAvailable succeeds, before returning: - migrationsDir := "migrations" - - if err := r.ApplyMigrations(ctx, migrationsDir); err != nil { - r.logger.Error("Migrations failed", zap.Error(err), zap.String("dir", migrationsDir)) - return fmt.Errorf("apply migrations: %w", err) - } - - r.logger.Info("RQLite node started successfully") - return nil -} - -// hasExistingState returns true if the rqlite data directory already contains files or subdirectories. -func (r *RQLiteManager) hasExistingState(rqliteDataDir string) bool { - entries, err := os.ReadDir(rqliteDataDir) - if err != nil { - return false - } - for _, e := range entries { - // Any existing file or directory indicates prior state - if e.Name() == "." || e.Name() == ".." 
{ - continue - } - return true - } - return false -} - -// waitForReady waits for RQLite to be ready to accept connections -func (r *RQLiteManager) waitForReady(ctx context.Context) error { - url := fmt.Sprintf("http://localhost:%d/status", r.config.RQLitePort) - client := &http.Client{Timeout: 2 * time.Second} - - for i := 0; i < 30; i++ { - select { - case <-ctx.Done(): - return ctx.Err() - default: - } - - resp, err := client.Get(url) - if err == nil { - resp.Body.Close() - if resp.StatusCode == http.StatusOK { - return nil - } - } - - time.Sleep(1 * time.Second) - } - - return fmt.Errorf("RQLite did not become ready within timeout") -} - -// waitForLeadership waits for RQLite to establish leadership (for bootstrap nodes) -func (r *RQLiteManager) waitForLeadership(ctx context.Context) error { - r.logger.Info("Waiting for RQLite to establish leadership...") - - for i := 0; i < 30; i++ { - select { - case <-ctx.Done(): - return ctx.Err() - default: - } - - // Try a simple query to check if leadership is established - if r.connection != nil { - _, err := r.connection.QueryOne("SELECT 1") - if err == nil { - r.logger.Info("RQLite leadership established") - return nil - } - r.logger.Debug("Waiting for leadership", zap.Error(err)) - } - - time.Sleep(1 * time.Second) - } - - return fmt.Errorf("RQLite failed to establish leadership within timeout") -} - -// GetConnection returns the RQLite connection -func (r *RQLiteManager) GetConnection() *gorqlite.Connection { - return r.connection -} - -// Stop stops the RQLite node -func (r *RQLiteManager) Stop() error { - if r.connection != nil { - r.connection.Close() - r.connection = nil - } - - if r.cmd == nil || r.cmd.Process == nil { - return nil - } - - r.logger.Info("Stopping RQLite node (graceful)") - // Try SIGTERM first - if err := r.cmd.Process.Signal(syscall.SIGTERM); err != nil { - // Fallback to Kill if signaling fails - _ = r.cmd.Process.Kill() - return nil - } - - // Wait up to 5 seconds for graceful shutdown - done := make(chan error, 1) - go func() { done <- r.cmd.Wait() }() - - select { - case err := <-done: - if err != nil && !errors.Is(err, os.ErrClosed) { - r.logger.Warn("RQLite process exited with error", zap.Error(err)) - } - case <-time.After(5 * time.Second): - r.logger.Warn("RQLite did not exit in time; killing") - _ = r.cmd.Process.Kill() - } - - return nil -} - -// waitForJoinTarget waits until the join target's HTTP status becomes reachable, or until timeout -func (r *RQLiteManager) waitForJoinTarget(ctx context.Context, joinAddress string, timeout time.Duration) error { - var deadline time.Time - if timeout > 0 { - deadline = time.Now().Add(timeout) - } - var lastErr error - - for { - if err := r.testJoinAddress(joinAddress); err == nil { - r.logger.Info("Join target is reachable, proceeding with cluster join") - return nil - } else { - lastErr = err - r.logger.Debug("Join target not yet reachable; waiting...", zap.String("join_address", joinAddress), zap.Error(err)) - } - - // Check context - select { - case <-ctx.Done(): - return ctx.Err() - case <-time.After(2 * time.Second): - } - - if !deadline.IsZero() && time.Now().After(deadline) { - break - } - } - - return lastErr -} - -// testJoinAddress tests if a join address is reachable -func (r *RQLiteManager) testJoinAddress(joinAddress string) error { - // Determine the HTTP status URL to probe. - // If joinAddress contains a scheme, use it directly. Otherwise treat joinAddress - // as host:port (Raft) and probe the standard HTTP API port 5001 on that host. 
-	client := &http.Client{Timeout: 5 * time.Second}
-
-	var statusURL string
-	if strings.HasPrefix(joinAddress, "http://") || strings.HasPrefix(joinAddress, "https://") {
-		statusURL = strings.TrimRight(joinAddress, "/") + "/status"
-	} else {
-		// Extract host from host:port
-		host := joinAddress
-		if idx := strings.Index(joinAddress, ":"); idx != -1 {
-			host = joinAddress[:idx]
-		}
-		statusURL = fmt.Sprintf("http://%s:%d/status", host, 5001)
-	}
-
-	r.logger.Debug("Testing join target via HTTP", zap.String("url", statusURL))
-	resp, err := client.Get(statusURL)
-	if err != nil {
-		return fmt.Errorf("failed to connect to leader HTTP at %s: %w", statusURL, err)
-	}
-	defer resp.Body.Close()
-	if resp.StatusCode != http.StatusOK {
-		return fmt.Errorf("leader HTTP at %s returned status %d", statusURL, resp.StatusCode)
-	}
-
-	r.logger.Info("Leader HTTP reachable", zap.String("status_url", statusURL))
-	return nil
-}
diff --git a/pkg/rqlite/vector_clock.go b/pkg/rqlite/vector_clock.go
new file mode 100644
index 0000000..867cb01
--- /dev/null
+++ b/pkg/rqlite/vector_clock.go
@@ -0,0 +1,81 @@
+package rqlite
+
+// VectorClock represents a vector clock for distributed consistency
+type VectorClock map[string]uint64
+
+// NewVectorClock creates a new vector clock
+func NewVectorClock() VectorClock {
+	return make(VectorClock)
+}
+
+// Increment increments the clock for a given node
+func (vc VectorClock) Increment(nodeID string) {
+	vc[nodeID]++
+}
+
+// Update updates the vector clock with values from another clock,
+// taking the element-wise maximum
+func (vc VectorClock) Update(other VectorClock) {
+	for nodeID, value := range other {
+		if existing, exists := vc[nodeID]; !exists || value > existing {
+			vc[nodeID] = value
+		}
+	}
+}
+
+// Copy creates a copy of the vector clock
+func (vc VectorClock) Copy() VectorClock {
+	out := make(VectorClock, len(vc))
+	for k, v := range vc {
+		out[k] = v
+	}
+	return out
+}
+
+// Compare compares two vector clocks.
+// Returns -1 if vc happened before other, 1 if vc happened after other,
+// and 0 if the clocks are equal or concurrent.
+func (vc VectorClock) Compare(other VectorClock) int {
+	less := false
+	greater := false
+
+	// Check all keys in both clocks
+	allKeys := make(map[string]bool)
+	for k := range vc {
+		allKeys[k] = true
+	}
+	for k := range other {
+		allKeys[k] = true
+	}
+
+	for k := range allKeys {
+		v1 := vc[k]
+		v2 := other[k]
+
+		if v1 < v2 {
+			less = true
+		} else if v1 > v2 {
+			greater = true
+		}
+	}
+
+	if less && !greater {
+		return -1 // vc < other
+	} else if greater && !less {
+		return 1 // vc > other
+	}
+	return 0 // equal or concurrent
+}
+
+// HappensBefore checks if this clock happens before another
+func (vc VectorClock) HappensBefore(other VectorClock) bool {
+	return vc.Compare(other) == -1
+}
+
+// HappensAfter checks if this clock happens after another
+func (vc VectorClock) HappensAfter(other VectorClock) bool {
+	return vc.Compare(other) == 1
+}
+
+// IsConcurrent checks if two clocks are concurrent (neither happens before the
+// other). Note that identical clocks also report true here, since Compare
+// returns 0 for both the equal and the concurrent case.
+func (vc VectorClock) IsConcurrent(other VectorClock) bool {
+	return vc.Compare(other) == 0
+}
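
A minimal, self-contained sketch of how the `VectorClock` API above is meant to be used for metadata conflict resolution. The node IDs are hypothetical, and the import path assumes the module layout used elsewhere in this patch:

```go
package main

import (
	"fmt"

	"github.com/DeBrosOfficial/network/pkg/rqlite"
)

func main() {
	a := rqlite.NewVectorClock()
	b := rqlite.NewVectorClock()

	a.Increment("nodeA") // nodeA records a local write: a = {nodeA:1}
	b.Update(a)          // nodeB merges a's history:    b = {nodeA:1}
	b.Increment("nodeB") // nodeB records its own write: b = {nodeA:1, nodeB:1}

	fmt.Println(a.HappensBefore(b)) // true: b strictly dominates a

	a.Increment("nodeA")           // a = {nodeA:2}; now neither clock dominates
	fmt.Println(a.IsConcurrent(b)) // true: concurrent updates, a merge is needed

	a.Update(b)                    // element-wise max: a = {nodeA:2, nodeB:1}
	fmt.Println(a.HappensAfter(b)) // true after merging
}
```

The concurrent case is the one the gossip layer has to resolve with its merge rules; `HappensBefore`/`HappensAfter` cover the cases where one replica's metadata can simply be overwritten.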