started working on clustering

This commit is contained in:
anonpenguin23 2025-10-13 07:41:46 +03:00
parent 2eb4db3ddb
commit f2d5a0790e
No known key found for this signature in database
GPG Key ID: 1CBB1FE35AFBEE30
28 changed files with 4790 additions and 2699 deletions


@ -0,0 +1,165 @@
# Dynamic Database Clustering — Implementation Plan
### Scope
Implement the feature described in `DYNAMIC_DATABASE_CLUSTERING.md`: decentralized metadata via libp2p pubsub, dynamic per-database rqlite clusters (3-node default), idle hibernation/wake-up, node failure replacement, and client UX that exposes `cli.Database(name)` with app namespacing.
### Guiding Principles
- Reuse existing `pkg/pubsub` and `pkg/rqlite` where practical; avoid singletons.
- Backward-compatible config migration with deprecations, feature-flag controlled rollout.
- Strong eventual consistency (vector clocks + periodic gossip) over centralized control planes.
- Tests and observability at each phase.
### Phase 0: Prep & Scaffolding
- Add feature flag `dynamic_db_clustering` (env/config) → default off.
- Introduce config shape for new `database` fields while supporting legacy fields (soft deprecated).
- Create empty packages and interfaces to enable incremental compilation:
- `pkg/metadata/{types.go,manager.go,pubsub.go,consensus.go,vector_clock.go}`
- `pkg/dbcluster/{manager.go,lifecycle.go,subprocess.go,ports.go,health.go,metrics.go}`
- Ensure rqlite subprocess availability (binary path detection, `scripts/install-debros-network.sh` update if needed).
- Establish CI jobs for new unit/integration suites and longer-running e2e.
### Phase 1: Metadata Layer (No hibernation yet)
- Implement metadata types and store (RW locks, versioning) inside `pkg/rqlite/metadata.go`:
- `DatabaseMetadata`, `NodeCapacity`, `PortRange`, `MetadataStore`.
- Pubsub schema and handlers inside `pkg/rqlite/pubsub.go` using existing `pkg/pubsub` bridge:
- Topic `/debros/metadata/v1`; messages for create request/response/confirm, status, node capacity, health.
- Consensus helpers inside `pkg/rqlite/consensus.go` and `pkg/rqlite/vector_clock.go`:
- Deterministic coordinator (lowest peer ID), vector clocks, merge rules, periodic full-state gossip (checksums + fetch diffs); a vector-clock shape sketch follows this phase's list.
- Reuse existing node connectivity/backoff; no new ping service required.
- Skip unit tests for now; validate by wiring e2e flows later.
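A minimal shape sketch for the vector clock these helpers assume (illustrative, not the final API; package placement follows the Phase 0 scaffolding):

```go
package metadata

// VectorClock maps peer IDs to logical counters.
type VectorClock map[string]uint64

// Increment bumps the counter for the local node.
func (vc VectorClock) Increment(nodeID string) { vc[nodeID]++ }

// Merge keeps the element-wise maximum of both clocks.
func (vc VectorClock) Merge(other VectorClock) {
	for id, n := range other {
		if n > vc[id] {
			vc[id] = n
		}
	}
}

// Compare returns -1 if vc happened-before other, 1 if after,
// and 0 when the clocks are concurrent or identical.
func (vc VectorClock) Compare(other VectorClock) int {
	less, greater := false, false
	for id := range keyUnion(vc, other) {
		switch a, b := vc[id], other[id]; {
		case a < b:
			less = true
		case a > b:
			greater = true
		}
	}
	switch {
	case less && !greater:
		return -1
	case greater && !less:
		return 1
	default:
		return 0
	}
}

func keyUnion(a, b VectorClock) map[string]struct{} {
	keys := make(map[string]struct{}, len(a)+len(b))
	for id := range a {
		keys[id] = struct{}{}
	}
	for id := range b {
		keys[id] = struct{}{}
	}
	return keys
}
```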
### Phase 2: Database Creation & Client API
- Port management:
- `PortManager` with bind-probing, random allocation within configured ranges; local bookkeeping (see the allocation sketch after this phase's list).
- Subprocess control:
- `RQLiteInstance` lifecycle (start, wait ready via /status and simple query, stop, status).
- Cluster manager:
- `ClusterManager` keeps `activeClusters`, listens to metadata events, executes the creation protocol, fans in readiness, and surfaces failures.
- Client API:
- Update `pkg/client/interface.go` to include `Database(name string)`.
- Implement app namespacing in `pkg/client/client.go` (sanitize app name + db name).
- Backoff polling for readiness during creation.
- Data isolation:
- Data dir per db: `./data/<app>_<db>/rqlite` (respect node `data_dir` base).
- Integration tests: create single db across 3 nodes; multiple databases coexisting; cross-node read/write.
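A bind-probing allocation sketch under the assumptions above (random pick within the configured range, local bookkeeping only):

```go
package dbcluster

import (
	"fmt"
	"math/rand"
	"net"
)

// PortRange is the configured allocation window (illustrative shape).
type PortRange struct {
	Start, End int
}

// probe reports whether a TCP port can currently be bound on all interfaces.
func probe(port int) bool {
	l, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
	if err != nil {
		return false
	}
	_ = l.Close()
	return true
}

// AllocatePort picks a random port in the range and bind-probes it, giving up
// after a bounded number of attempts. Bookkeeping is local only; races with
// other processes between probe and actual use remain possible.
func AllocatePort(r PortRange, allocated map[int]bool) (int, error) {
	span := r.End - r.Start + 1
	for attempt := 0; attempt < span; attempt++ {
		p := r.Start + rand.Intn(span)
		if allocated[p] || !probe(p) {
			continue
		}
		allocated[p] = true
		return p, nil
	}
	return 0, fmt.Errorf("port range %d-%d exhausted", r.Start, r.End)
}
```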
### Phase 3: Hibernation & Wake-Up
- Idle detection and coordination:
- Track `LastQuery` per instance; periodic scan; all-nodes-idle quorum → coordinated shutdown schedule.
- Hibernation protocol:
- Broadcast idle notices, coordinator schedules `DATABASE_SHUTDOWN_COORDINATED`, graceful SIGTERM, ports freed, status → `hibernating`.
- Wake-up protocol:
- Client detects `hibernating`, performs CAS to `waking`, triggers wake request; port reuse if available else re-negotiate; start instances; status → `active`.
- Client retry UX:
- Transparent retries with exponential backoff; treat `waking` as a wait-only state (sketched after this list).
- Tests: hibernation under load; thundering herd; resource verification and persistence across cycles.
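A minimal sketch of the client wake path, assuming a hypothetical `metadataView` CAS surface (all names illustrative):

```go
package client

import (
	"context"
	"fmt"
	"time"
)

// metadataView is a hypothetical read/CAS surface over the metadata store.
type metadataView interface {
	CompareAndSwapStatus(db, from, to string) bool
	Status(db string) string
}

// tryWake performs the hibernating -> waking CAS so only one caller triggers
// the wake broadcast, then polls with exponential backoff until active.
func tryWake(ctx context.Context, store metadataView, db string, wake func(db string)) error {
	if store.CompareAndSwapStatus(db, "hibernating", "waking") {
		wake(db) // winner broadcasts the wake request
	}
	backoff := 100 * time.Millisecond
	for {
		switch store.Status(db) {
		case "active":
			return nil
		case "waking", "hibernating":
			// wait-only states: keep polling
		default:
			return fmt.Errorf("database %s in unexpected state", db)
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(backoff):
		}
		if backoff < 2*time.Second {
			backoff *= 2
		}
	}
}
```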
### Phase 4: Resilience (Failure & Replacement)
- Continuous health checks with timeouts → mark node unhealthy.
- Replacement orchestration:
- Coordinator initiates `NODE_REPLACEMENT_NEEDED`, eligible nodes respond, confirm selection, new node joins raft via `-join` then syncs.
- Startup reconciliation:
- Detect and cleanup orphaned or non-member local data directories.
- Rate limit replacements to prevent cascades; prioritize by usage metrics (see the limiter sketch after this list).
- Tests: forced crashes, partitions, replacement within target SLO; reconciliation sanity.
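A sliding-window limiter sketch for the replacement rate limiting above (illustrative):

```go
package dbcluster

import (
	"sync"
	"time"
)

// replacementLimiter allows at most max replacement starts per sliding
// window; a shared instance would gate NODE_REPLACEMENT_NEEDED handling.
type replacementLimiter struct {
	mu     sync.Mutex
	window time.Duration
	max    int
	starts []time.Time
}

// Allow reports whether a new replacement may start now, recording it if so.
func (l *replacementLimiter) Allow(now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	cutoff := now.Add(-l.window)
	kept := l.starts[:0]
	for _, t := range l.starts {
		if t.After(cutoff) {
			kept = append(kept, t)
		}
	}
	l.starts = kept
	if len(l.starts) >= l.max {
		return false
	}
	l.starts = append(l.starts, now)
	return true
}
```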
### Phase 5: Production Hardening & Optimization
- Metrics/logging:
- Structured logs with trace IDs; counters for queries/min, hibernations, wake-ups, replacements; health and capacity gauges.
- Config validation, replication factor settings (1,3,5), and debugging APIs (read-only metadata dump, node status).
- Client metadata caching and query routing improvements (simple round-robin → latency-aware later).
- Performance benchmarks and operator-facing docs.
### File Changes (Essentials)
- `pkg/config/config.go`
- Remove (deprecate, then delete): `Database.DataDir`, `RQLitePort`, `RQLiteRaftPort`, `RQLiteJoinAddress`.
- Add: `ReplicationFactor int`, `HibernationTimeout time.Duration`, `MaxDatabases int`, `PortRange {HTTPStart, HTTPEnd, RaftStart, RaftEnd int}`, `Discovery.HealthCheckInterval`.
- `pkg/client/interface.go`/`pkg/client/client.go`
- Add `Database(name string)` and app namespace requirement (`DefaultClientConfig(appName)`); backoff polling.
- `pkg/node/node.go`
- Wire `metadata.Manager` and `dbcluster.ClusterManager`; remove direct rqlite singleton usage.
- `pkg/rqlite/*`
- Refactor to instance-oriented helpers from singleton.
- New packages under `pkg/metadata` and `pkg/dbcluster` as listed above.
- `configs/node.yaml` and validation paths to reflect new `database` block.
### Config Example (target end-state)
```yaml
node:
  data_dir: "./data"

database:
  replication_factor: 3
  hibernation_timeout: 60s
  max_databases: 100
  port_range:
    http_start: 5001
    http_end: 5999
    raft_start: 7001
    raft_end: 7999

discovery:
  health_check_interval: 10s
```
### Rollout Strategy
- Keep feature flag off by default; support legacy single-cluster path.
- Ship Phase 1 behind flag; enable in dev/e2e only.
- Incrementally enable creation (Phase 2), then hibernation (Phase 3) per environment.
- Remove legacy config after deprecation window.
### Testing & Quality Gates
- Unit tests: metadata ops, consensus, ports, subprocess, manager state machine.
- Integration tests under `e2e/` for creation, isolation, hibernation, failure handling, partitions.
- Benchmarks for creation (<10s), wake-up (<8s), metadata sync (<5s), query overhead (<10ms).
- Chaos suite for randomized failures and partitions.
### Risks & Mitigations (operationalized)
- Metadata divergence → vector clocks + periodic checksums + majority read checks in client.
- Raft churn → adaptive timeouts; allow `always_on` flag per-db (future).
- Cascading replacements → global rate limiter and prioritization.
- Debuggability → verbose structured logging and metadata dump endpoints.
### Timeline (indicative)
- Weeks 1-2: Phases 0-1
- Weeks 3-4: Phase 2
- Weeks 5-6: Phase 3
- Weeks 7-8: Phase 4
- Weeks 9-10+: Phase 5
### To-dos
- [ ] Add feature flag, scaffold packages, CI jobs, rqlite binary checks
- [ ] Extend `pkg/config/config.go` and YAML schemas; deprecate legacy fields
- [ ] Implement metadata types and thread-safe store with versioning
- [ ] Implement pubsub messages and handlers using existing pubsub manager
- [ ] Implement coordinator election, vector clocks, gossip reconciliation
- [ ] Implement `PortManager` with bind-probing and allocation
- [ ] Implement rqlite subprocess control and readiness checks
- [ ] Implement `ClusterManager` and creation lifecycle orchestration
- [ ] Add `Database(name)` and app namespacing to client; backoff polling
- [ ] Adopt per-database data dirs under node `data_dir`
- [ ] Integration tests for creation and isolation across nodes
- [ ] Idle detection, coordinated shutdown, status updates
- [ ] Wake-up CAS to `waking`, port reuse/renegotiation, restart
- [ ] Client transparent retry/backoff for hibernation and waking
- [ ] Health checks, replacement orchestration, rate limiting
- [ ] Implement orphaned data reconciliation on startup
- [ ] Add metrics and structured logging across managers
- [ ] Benchmarks for creation, wake-up, sync, query overhead
- [ ] Operator and developer docs; config and migration guides

DYNAMIC_CLUSTERING_GUIDE.md Normal file

@ -0,0 +1,504 @@
# Dynamic Database Clustering - User Guide
## Overview
Dynamic Database Clustering enables on-demand creation of isolated, replicated rqlite database clusters with automatic resource management through hibernation. Each database runs as a separate 3-node cluster with its own data directory and port allocation.
## Key Features
- **Multi-Database Support** - Create unlimited isolated databases on-demand
- **3-Node Replication** - Fault-tolerant by default (configurable)
- **Auto Hibernation** - Idle databases hibernate to save resources
- **Transparent Wake-Up** - Automatic restart on access
- **App Namespacing** - Databases are scoped by application name
- **Decentralized Metadata** - LibP2P pubsub-based coordination
- **Failure Recovery** - Automatic node replacement on failures
- **Resource Optimization** - Dynamic port allocation and data isolation
## Configuration
### Node Configuration (`configs/node.yaml`)
```yaml
node:
  data_dir: "./data"
  listen_addresses:
    - "/ip4/0.0.0.0/tcp/4001"
  max_connections: 50

database:
  replication_factor: 3        # Number of replicas per database
  hibernation_timeout: 60s     # Idle time before hibernation
  max_databases: 100           # Max databases per node
  port_range_http_start: 5001  # HTTP port range start
  port_range_http_end: 5999    # HTTP port range end
  port_range_raft_start: 7001  # Raft port range start
  port_range_raft_end: 7999    # Raft port range end

discovery:
  bootstrap_peers:
    - "/ip4/127.0.0.1/tcp/4001/p2p/..."
  discovery_interval: 30s
  health_check_interval: 10s
```
### Key Configuration Options
#### `database.replication_factor` (default: 3)
Number of nodes that will host each database cluster. Minimum 1, recommended 3 for fault tolerance.
#### `database.hibernation_timeout` (default: 60s)
Time of inactivity before a database is hibernated. Set to 0 to disable hibernation.
#### `database.max_databases` (default: 100)
Maximum number of databases this node can host simultaneously.
#### `database.port_range_*`
Port ranges for dynamic allocation. Each database instance consumes one HTTP and one Raft port, so make each range at least `max_databases` ports wide.
## Client Usage
### Creating/Accessing Databases
```go
package main

import (
	"context"

	"github.com/DeBrosOfficial/network/pkg/client"
)

func main() {
	// Create client with app name for namespacing
	cfg := client.DefaultClientConfig("myapp")
	cfg.BootstrapPeers = []string{
		"/ip4/127.0.0.1/tcp/4001/p2p/...",
	}

	c, err := client.NewClient(cfg)
	if err != nil {
		panic(err)
	}

	// Connect to network
	if err := c.Connect(); err != nil {
		panic(err)
	}
	defer c.Disconnect()

	// Get database client (creates the database if it doesn't exist)
	db, err := c.Database().Database("users")
	if err != nil {
		panic(err)
	}

	// Use the database
	ctx := context.Background()
	if err := db.CreateTable(ctx, `
		CREATE TABLE users (
			id INTEGER PRIMARY KEY,
			name TEXT NOT NULL,
			email TEXT UNIQUE
		)
	`); err != nil {
		panic(err)
	}

	// Query data
	result, err := db.Query(ctx, "SELECT * FROM users")
	if err != nil {
		panic(err)
	}
	_ = result
	// ...
}
```
### Database Naming
Databases are automatically namespaced by your application name:
- `c.Database().Database("users")` → creates `myapp_users` internally
- This prevents name collisions between different applications
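As a minimal illustration, continuing the client example above (the derived names come from the sanitization rule, not an extra API):

```go
// Handles from the same client share the "myapp" namespace but point at
// fully isolated clusters.
usersDB, _ := c.Database().Database("users")   // internally myapp_users
ordersDB, _ := c.Database().Database("orders") // internally myapp_orders
_, _ = usersDB, ordersDB
```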
## Gateway API Usage
If you prefer HTTP/REST API access instead of the Go client, you can use the gateway endpoints:
### Base URL
```
http://gateway-host:8080/v1/database/
```
### Execute SQL (INSERT, UPDATE, DELETE, DDL)
```bash
POST /v1/database/exec
Content-Type: application/json
{
"database": "users",
"sql": "INSERT INTO users (name, email) VALUES (?, ?)",
"args": ["Alice", "alice@example.com"]
}
Response:
{
"rows_affected": 1,
"last_insert_id": 1
}
```
### Query Data (SELECT)
```bash
POST /v1/database/query
Content-Type: application/json
{
"database": "users",
"sql": "SELECT * FROM users WHERE name LIKE ?",
"args": ["A%"]
}
Response:
{
"items": [
{"id": 1, "name": "Alice", "email": "alice@example.com"}
],
"count": 1
}
```
### Execute Transaction
```bash
POST /v1/database/transaction
Content-Type: application/json
{
"database": "users",
"queries": [
"INSERT INTO users (name, email) VALUES ('Bob', 'bob@example.com')",
"UPDATE users SET email = 'alice.new@example.com' WHERE name = 'Alice'"
]
}
Response:
{
"success": true
}
```
### Get Schema
```bash
GET /v1/database/schema?database=users
# OR
POST /v1/database/schema
Content-Type: application/json
{
"database": "users"
}
Response:
{
"tables": [
{
"name": "users",
"columns": ["id", "name", "email"]
}
]
}
```
### Create Table
```bash
POST /v1/database/create-table
Content-Type: application/json
{
"database": "users",
"schema": "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
}
Response:
{
"rows_affected": 0
}
```
### Drop Table
```bash
POST /v1/database/drop-table
Content-Type: application/json
{
"database": "users",
"table_name": "old_table"
}
Response:
{
"rows_affected": 0
}
```
### List Databases
```bash
GET /v1/database/list
Response:
{
"databases": ["users", "products", "orders"]
}
```
### Important Notes
1. **Authentication Required**: All endpoints require authentication (JWT or API key)
2. **Database Creation**: Databases are created automatically on first access
3. **Hibernation**: The gateway handles hibernation/wake-up transparently - you may experience a delay (< 8s) on first query to a hibernating database
4. **Timeouts**: Query timeout is 30s, transaction timeout is 60s
5. **Namespacing**: Database names are automatically prefixed with your app name
6. **Concurrent Access**: All endpoints are safe for concurrent use
## Database Lifecycle
### 1. Creation
When you first access a database:
1. **Request Broadcast** - Node broadcasts `DATABASE_CREATE_REQUEST`
2. **Node Selection** - Eligible nodes respond with available ports
3. **Coordinator Selection** - Deterministic coordinator (lowest peer ID) chosen
4. **Confirmation** - Coordinator selects nodes and broadcasts `DATABASE_CREATE_CONFIRM`
5. **Instance Startup** - Selected nodes start rqlite subprocesses
6. **Readiness** - Nodes report `active` status when ready
**Typical creation time: < 10 seconds**
### 2. Active State
- Database instances run as rqlite subprocesses
- Each instance tracks `LastQuery` timestamp
- Queries update the activity timestamp
- Metadata synced across all network nodes
### 3. Hibernation
After `hibernation_timeout` of inactivity:
1. **Idle Detection** - Nodes detect idle databases
2. **Idle Notification** - Nodes broadcast idle status
3. **Coordinated Shutdown** - When all nodes report idle, coordinator schedules shutdown
4. **Graceful Stop** - SIGTERM sent to rqlite processes
5. **Port Release** - Ports freed for reuse
6. **Status Update** - Metadata updated to `hibernating`
**Data persists on disk during hibernation**
### 4. Wake-Up
On first query to hibernating database:
1. **Detection** - Client/node detects `hibernating` status
2. **Wake Request** - Broadcast `DATABASE_WAKEUP_REQUEST`
3. **Port Allocation** - Reuse original ports or allocate new ones
4. **Instance Restart** - Restart rqlite with existing data
5. **Status Update** - Update to `active` when ready
**Typical wake-up time: < 8 seconds**
### 5. Failure Recovery
When a node fails:
1. **Health Detection** - Missed health checks trigger failure detection
2. **Replacement Request** - Surviving nodes broadcast `NODE_REPLACEMENT_NEEDED`
3. **Offers** - Healthy nodes with capacity offer to replace
4. **Selection** - First offer accepted (simple approach)
5. **Join Cluster** - New node joins existing Raft cluster
6. **Sync** - Data synced from existing members
## Data Management
### Data Directories
Each database gets its own data directory:
```
./data/
├── myapp_users/ # Database: users
│ └── rqlite/
│ ├── db.sqlite
│ └── raft/
├── myapp_products/ # Database: products
│ └── rqlite/
└── myapp_orders/ # Database: orders
└── rqlite/
```
### Orphaned Data Cleanup
On node startup, the system automatically:
- Scans data directories
- Checks against metadata
- Removes directories for:
- Non-existent databases
- Databases where this node is not a member
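A sketch of what this reconciliation pass could look like (the `lookup` callback and `databaseMeta` record are stand-ins for the real metadata store):

```go
package dbcluster

import (
	"os"
	"path/filepath"
)

// databaseMeta is a stand-in for the real metadata record.
type databaseMeta struct{ Members map[string]bool }

// reconcileDataDirs deletes local database directories that either have no
// metadata entry or whose cluster no longer includes this node.
func reconcileDataDirs(dataDir, nodeID string, lookup func(name string) *databaseMeta) error {
	entries, err := os.ReadDir(dataDir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		meta := lookup(e.Name())
		if meta == nil || !meta.Members[nodeID] {
			// Orphaned: database unknown, or this node is not a member.
			if err := os.RemoveAll(filepath.Join(dataDir, e.Name())); err != nil {
				return err
			}
		}
	}
	return nil
}
```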
## Monitoring & Debugging
### Structured Logging
All operations are logged with structured fields:
```
INFO Starting cluster manager node_id=12D3... max_databases=100
INFO Received database create request database=myapp_users requester=12D3...
INFO Database instance started database=myapp_users http_port=5001 raft_port=7001
INFO Database is idle database=myapp_users idle_time=62s
INFO Database hibernated successfully database=myapp_users
INFO Received wakeup request database=myapp_users
INFO Database woke up successfully database=myapp_users
```
### Health Checks
Nodes perform periodic health checks:
- Every `health_check_interval` (default: 10s)
- Tracks last-seen time for each peer
- 3 missed checks → node marked unhealthy
- Triggers replacement protocol for affected databases
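A minimal sketch of the miss-counting policy (assumed shape; the real tracker also records last-seen times):

```go
package dbcluster

import "sync"

const maxMissedChecks = 3 // matches the "3 missed checks" policy above

// healthTracker counts consecutive missed health checks per peer.
type healthTracker struct {
	mu     sync.Mutex
	misses map[string]int
}

// Observe records one check result and reports whether the peer has just
// crossed the unhealthy threshold.
func (h *healthTracker) Observe(peerID string, ok bool) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.misses == nil {
		h.misses = make(map[string]int)
	}
	if ok {
		h.misses[peerID] = 0
		return false
	}
	h.misses[peerID]++
	return h.misses[peerID] == maxMissedChecks
}
```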
## Best Practices
### 1. **Capacity Planning**
```yaml
# For 100 databases per node with 3-node replication:
database:
  max_databases: 100
  port_range_http_start: 5001
  port_range_http_end: 5200   # 200 HTTP ports: headroom above max_databases
  port_range_raft_start: 7001
  port_range_raft_end: 7200   # 200 Raft ports: headroom above max_databases
```
### 2. **Hibernation Tuning**
- **High Traffic**: Set `hibernation_timeout: 300s` or higher
- **Development**: Set `hibernation_timeout: 30s` for faster cycles
- **Always-On DBs**: Set `hibernation_timeout: 0` to disable
### 3. **Replication Factor**
- **Development**: `replication_factor: 1` (single node, no replication)
- **Production**: `replication_factor: 3` (fault tolerant)
- **High Availability**: `replication_factor: 5` (survives 2 failures)
### 4. **Network Topology**
- Use at least 3 nodes for `replication_factor: 3`
- Ensure `total databases × replication_factor` stays within the sum of `max_databases` across nodes
- Example: 3 nodes × 100 `max_databases` = capacity for 300 database instances (100 databases at `replication_factor: 3`)
## Troubleshooting
### Database Creation Fails
**Problem**: `insufficient nodes responded: got 1, need 3`
**Solution**:
- Ensure you have at least `replication_factor` nodes online
- Check `max_databases` limit on nodes
- Verify port ranges aren't exhausted
### Database Not Waking Up
**Problem**: Database stays in `waking` status
**Solution**:
- Check node logs for rqlite startup errors
- Verify rqlite binary is installed
- Check port conflicts (use different port ranges)
- Ensure data directory is accessible
### Orphaned Data
**Problem**: Disk space consumed by old databases
**Solution**:
- Orphaned data is automatically cleaned on node restart
- Manual cleanup: Delete directories from `./data/` that don't match metadata
- Check logs for reconciliation results
### Node Replacement Not Working
**Problem**: Failed node not replaced
**Solution**:
- Ensure remaining nodes have capacity (`CurrentDatabases < MaxDatabases`)
- Check network connectivity between nodes
- Verify health check interval is reasonable (not too aggressive)
## Advanced Topics
### Metadata Consistency
- **Vector Clocks**: Each metadata update includes vector clock for conflict resolution
- **Gossip Protocol**: Periodic metadata sync via checksums
- **Eventual Consistency**: All nodes eventually agree on database state
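A sketch of the checksum side of the gossip protocol (the snapshot type and JSON encoding are illustrative choices, not the shipped format):

```go
package metadata

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"sort"
)

// metadataChecksum digests a snapshot of database metadata in sorted-name
// order so every node computes the same value for the same state. Peers
// gossip checksums; a mismatch triggers a targeted diff fetch.
func metadataChecksum(snapshot map[string]any) (string, error) {
	names := make([]string, 0, len(snapshot))
	for name := range snapshot {
		names = append(names, name)
	}
	sort.Strings(names)

	h := sha256.New()
	for _, name := range names {
		b, err := json.Marshal(snapshot[name])
		if err != nil {
			return "", err
		}
		h.Write([]byte(name))
		h.Write(b)
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}
```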
### Port Management
- Ports allocated randomly within configured ranges
- Bind-probing ensures ports are actually available
- Ports reused during wake-up when possible
- Failed allocations fall back to new random ports
### Coordinator Election
- Deterministic selection based on lexicographical peer ID ordering
- Lowest peer ID becomes coordinator
- No persistent coordinator state
- Re-election occurs for each database operation
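The whole election fits in a few lines; a sketch:

```go
package metadata

import (
	"errors"
	"sort"
)

// electCoordinator picks the lexicographically lowest peer ID. Every node
// sorts the same candidate set, so the result is deterministic with no
// shared election state.
func electCoordinator(peerIDs []string) (string, error) {
	if len(peerIDs) == 0 {
		return "", errors.New("no candidate peers")
	}
	ids := append([]string(nil), peerIDs...) // copy to avoid mutating input
	sort.Strings(ids)
	return ids[0], nil
}
```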
## Migration from Legacy Mode
If upgrading from single-cluster rqlite:
1. **Backup Data**: Backup your existing `./data/rqlite` directory
2. **Update Config**: Remove deprecated fields:
- `database.data_dir`
- `database.rqlite_port`
- `database.rqlite_raft_port`
- `database.rqlite_join_address`
3. **Add New Fields**: Configure dynamic clustering (see Configuration section)
4. **Restart Nodes**: Restart all nodes with new configuration
5. **Migrate Data**: Create new database and import data from backup
## Future Enhancements
The following features are planned for future releases:
### **Advanced Metrics** (Future)
- Prometheus-style metrics export
- Per-database query counters
- Hibernation/wake-up latency histograms
- Resource utilization gauges
### **Performance Benchmarks** (Future)
- Automated benchmark suite
- Creation time SLOs
- Wake-up latency targets
- Query overhead measurements
### **Enhanced Monitoring** (Future)
- Dashboard for cluster visualization
- Database status API endpoint
- Capacity planning tools
- Alerting integration
## Support
For issues, questions, or contributions:
- GitHub Issues: https://github.com/DeBrosOfficial/network/issues
- Documentation: https://github.com/DeBrosOfficial/network/blob/main/DYNAMIC_DATABASE_CLUSTERING.md
## License
See LICENSE file for details.


@ -21,7 +21,7 @@ test-e2e:
.PHONY: build clean test run-node run-node2 run-node3 run-example deps tidy fmt vet lint clear-ports
VERSION := 0.51.0-beta
VERSION := 0.60.0-beta
COMMIT ?= $(shell git rev-parse --short HEAD 2>/dev/null || echo unknown)
DATE ?= $(shell date -u +%Y-%m-%dT%H:%M:%SZ)
LDFLAGS := -X 'main.version=$(VERSION)' -X 'main.commit=$(COMMIT)' -X 'main.date=$(DATE)'
@ -53,13 +53,25 @@ run-node:
# Usage: make run-node2 JOINADDR=/ip4/127.0.0.1/tcp/5001 HTTP=5002 RAFT=7002 P2P=4002
run-node2:
@echo "Starting regular node2 with config..."
go run ./cmd/node --config configs/node.yaml
go run ./cmd/node --config configs/node.yaml -id node2 -p2p-port 4002
# Run third node (regular) - requires join address of bootstrap node
# Usage: make run-node3 JOINADDR=/ip4/127.0.0.1/tcp/5001 HTTP=5003 RAFT=7003 P2P=4003
run-node3:
@echo "Starting regular node3 with config..."
go run ./cmd/node --config configs/node.yaml
go run ./cmd/node --config configs/node.yaml -id node3 -p2p-port 4003
run-node4:
@echo "Starting regular node4 with config..."
go run ./cmd/node --config configs/node.yaml -id node4 -p2p-port 4004
run-node5:
@echo "Starting regular node5 with config..."
go run ./cmd/node --config configs/node.yaml -id node5 -p2p-port 4005
run-node6:
@echo "Starting regular node6 with config..."
go run ./cmd/node --config configs/node.yaml -id node6 -p2p-port 4006
# Run gateway HTTP server
# Usage examples:

TESTING_GUIDE.md Normal file

@ -0,0 +1,827 @@
# Dynamic Database Clustering - Testing Guide
This guide provides a comprehensive list of unit tests, integration tests, and manual tests needed to verify the dynamic database clustering feature.
## Unit Tests
### 1. Metadata Store Tests (`pkg/rqlite/metadata_test.go`)
```go
// Test cases to implement:
func TestMetadataStore_GetSetDatabase(t *testing.T)
- Create store
- Set database metadata
- Get database metadata
- Verify data matches
func TestMetadataStore_DeleteDatabase(t *testing.T)
- Set database metadata
- Delete database
- Verify Get returns nil
func TestMetadataStore_ListDatabases(t *testing.T)
- Add multiple databases
- List all databases
- Verify count and contents
func TestMetadataStore_ConcurrentAccess(t *testing.T)
- Spawn multiple goroutines
- Concurrent reads and writes
- Verify no race conditions (run with -race)
func TestMetadataStore_NodeCapacity(t *testing.T)
- Set node capacity
- Get node capacity
- Update capacity
- List nodes
```
### 2. Vector Clock Tests (`pkg/rqlite/vector_clock_test.go`)
```go
func TestVectorClock_Increment(t *testing.T)
- Create empty vector clock
- Increment for node A
- Verify counter is 1
- Increment again
- Verify counter is 2
func TestVectorClock_Merge(t *testing.T)
- Create two vector clocks with different nodes
- Merge them
- Verify max values are preserved
func TestVectorClock_Compare(t *testing.T)
- Test strictly less than case
- Test strictly greater than case
- Test concurrent case
- Test identical case
func TestVectorClock_Concurrent(t *testing.T)
- Create clocks with overlapping updates
- Verify Compare returns 0 (concurrent)
```
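A concrete table-driven shape for the Compare cases (assumes the `VectorClock` map type and `Compare` semantics described above; adjust to the real API):

```go
func TestVectorClock_Compare(t *testing.T) {
	cases := []struct {
		name string
		a, b VectorClock
		want int
	}{
		{"identical", VectorClock{"n1": 1}, VectorClock{"n1": 1}, 0},
		{"happened-before", VectorClock{"n1": 1}, VectorClock{"n1": 2}, -1},
		{"happened-after", VectorClock{"n1": 2, "n2": 1}, VectorClock{"n1": 1, "n2": 1}, 1},
		{"concurrent", VectorClock{"n1": 1}, VectorClock{"n2": 1}, 0},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if got := tc.a.Compare(tc.b); got != tc.want {
				t.Fatalf("Compare() = %d, want %d", got, tc.want)
			}
		})
	}
}
```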
### 3. Consensus Tests (`pkg/rqlite/consensus_test.go`)
```go
func TestElectCoordinator_SingleNode(t *testing.T)
- Pass single node ID
- Verify it's elected
func TestElectCoordinator_MultipleNodes(t *testing.T)
- Pass multiple node IDs
- Verify lowest lexicographical ID wins
- Verify deterministic (same input = same output)
func TestElectCoordinator_EmptyList(t *testing.T)
- Pass empty list
- Verify error returned
func TestElectCoordinator_Deterministic(t *testing.T)
- Run election multiple times with same inputs
- Verify same coordinator each time
```
### 4. Port Manager Tests (`pkg/rqlite/ports_test.go`)
```go
func TestPortManager_AllocatePortPair(t *testing.T)
- Create manager with port range
- Allocate port pair
- Verify HTTP and Raft ports different
- Verify ports within range
func TestPortManager_ReleasePortPair(t *testing.T)
- Allocate port pair
- Release ports
- Verify ports can be reallocated
func TestPortManager_Exhaustion(t *testing.T)
- Allocate all available ports
- Attempt one more allocation
- Verify error returned
func TestPortManager_IsPortAllocated(t *testing.T)
- Allocate ports
- Check IsPortAllocated returns true
- Release ports
- Check IsPortAllocated returns false
func TestPortManager_AllocateSpecificPorts(t *testing.T)
- Allocate specific ports
- Verify allocation succeeds
- Attempt to allocate same ports again
- Verify error returned
```
### 5. RQLite Instance Tests (`pkg/rqlite/instance_test.go`)
```go
func TestRQLiteInstance_Create(t *testing.T)
- Create instance configuration
- Verify fields set correctly
func TestRQLiteInstance_IsIdle(t *testing.T)
- Set LastQuery to old timestamp
- Verify IsIdle returns true
- Update LastQuery
- Verify IsIdle returns false
// Integration test (requires rqlite binary):
func TestRQLiteInstance_StartStop(t *testing.T)
- Create instance
- Start instance
- Verify HTTP endpoint responsive
- Stop instance
- Verify process terminated
```
### 6. Pubsub Message Tests (`pkg/rqlite/pubsub_messages_test.go`)
```go
func TestMarshalUnmarshalMetadataMessage(t *testing.T)
- Create each message type
- Marshal to bytes
- Unmarshal back
- Verify data preserved
func TestDatabaseCreateRequest_Marshal(t *testing.T)
func TestDatabaseCreateResponse_Marshal(t *testing.T)
func TestDatabaseCreateConfirm_Marshal(t *testing.T)
func TestDatabaseStatusUpdate_Marshal(t *testing.T)
// ... for all message types
```
### 7. Coordinator Tests (`pkg/rqlite/coordinator_test.go`)
```go
func TestCreateCoordinator_AddResponse(t *testing.T)
- Create coordinator
- Add responses
- Verify response count
func TestCreateCoordinator_SelectNodes(t *testing.T)
- Add more responses than needed
- Call SelectNodes
- Verify correct number selected
- Verify deterministic selection
func TestCreateCoordinator_WaitForResponses(t *testing.T)
- Create coordinator
- Wait in goroutine
- Add responses from another goroutine
- Verify wait completes when enough responses
func TestCoordinatorRegistry(t *testing.T)
- Register coordinator
- Get coordinator
- Remove coordinator
- Verify lifecycle
```
## Integration Tests
### 1. Single Node Database Creation (`e2e/single_node_database_test.go`)
```go
func TestSingleNodeDatabaseCreation(t *testing.T)
- Start 1 node
- Set replication_factor = 1
- Create database
- Verify database active
- Write data
- Read data back
- Verify data matches
```
### 2. Three Node Database Creation (`e2e/three_node_database_test.go`)
```go
func TestThreeNodeDatabaseCreation(t *testing.T)
- Start 3 nodes
- Set replication_factor = 3
- Create database from node 1
- Wait for all nodes to report active
- Write data to node 1
- Read from node 2
- Verify replication worked
```
### 3. Multiple Databases (`e2e/multiple_databases_test.go`)
```go
func TestMultipleDatabases(t *testing.T)
- Start 3 nodes
- Create database "users"
- Create database "products"
- Create database "orders"
- Verify all databases active
- Write to each database
- Verify data isolation
```
### 4. Hibernation Cycle (`e2e/hibernation_test.go`)
```go
func TestHibernationCycle(t *testing.T)
- Start 3 nodes with hibernation_timeout=5s
- Create database
- Write initial data
- Wait 10 seconds (no activity)
- Verify status = hibernating
- Verify processes stopped
- Verify data persisted on disk
func TestWakeUpCycle(t *testing.T)
- Create and hibernate database
- Issue query
- Wait for wake-up
- Verify status = active
- Verify data still accessible
- Verify LastQuery updated
```
### 5. Node Failure and Recovery (`e2e/failure_recovery_test.go`)
```go
func TestNodeFailureDetection(t *testing.T)
- Start 3 nodes
- Create database
- Kill one node (SIGKILL)
- Wait for health checks to detect failure
- Verify NODE_REPLACEMENT_NEEDED broadcast
func TestNodeReplacement(t *testing.T)
- Start 4 nodes
- Create database on nodes 1,2,3
- Kill node 3
- Wait for replacement
- Verify node 4 joins cluster
- Verify data accessible from node 4
```
### 6. Orphaned Data Cleanup (`e2e/cleanup_test.go`)
```go
func TestOrphanedDataCleanup(t *testing.T)
- Start node
- Manually create orphaned data directory
- Restart node
- Verify orphaned directory removed
- Check logs for reconciliation message
```
### 7. Concurrent Operations (`e2e/concurrent_test.go`)
```go
func TestConcurrentDatabaseCreation(t *testing.T)
- Start 5 nodes
- Create 10 databases concurrently
- Verify all successful
- Verify no port conflicts
- Verify proper distribution
func TestConcurrentHibernation(t *testing.T)
- Create multiple databases
- Let all go idle
- Verify all hibernate correctly
- No race conditions
```
## Manual Test Scenarios
### Test 1: Basic Flow - Three Node Cluster
**Setup:**
```bash
# Terminal 1: Bootstrap node
cd data/bootstrap
../../bin/node --data bootstrap --id bootstrap --p2p-port 4001
# Terminal 2: Node 2
cd data/node
../../bin/node --data node --id node2 --p2p-port 4002
# Terminal 3: Node 3
cd data/node2
../../bin/node --data node2 --id node3 --p2p-port 4003
```
**Test Steps:**
1. **Create Database**
```bash
# Use client or API to create database "testdb"
```
2. **Verify Creation**
- Check logs on all 3 nodes for "Database instance started"
- Verify `./data/*/testdb/` directories exist on all nodes
- Check different ports allocated on each node
3. **Write Data**
```sql
CREATE TABLE users (id INT, name TEXT);
INSERT INTO users VALUES (1, 'Alice');
INSERT INTO users VALUES (2, 'Bob');
```
4. **Verify Replication**
- Query from each node
- Verify same data returned
**Expected Results:**
- All nodes show `status=active` for testdb
- Data replicated across all nodes
- Unique port pairs per node
---
### Test 2: Hibernation and Wake-Up
**Setup:** Same as Test 1 with database created
**Test Steps:**
1. **Check Activity**
```bash
# In logs, verify "last_query" timestamps updating on queries
```
2. **Wait for Hibernation**
- Stop issuing queries
- Wait `hibernation_timeout` + 10s
- Check logs for "Database is idle"
- Verify "Coordinated shutdown message sent"
- Verify "Database hibernated successfully"
3. **Verify Hibernation**
```bash
# Check that rqlite processes are stopped
ps aux | grep rqlite
# Verify data directories still exist
ls -la data/*/testdb/
```
4. **Wake Up**
- Issue a query to the database
- Watch logs for "Received wakeup request"
- Verify "Database woke up successfully"
- Verify query succeeds
**Expected Results:**
- Hibernation happens after idle timeout
- All 3 nodes hibernate in a coordinated fashion
- Wake-up completes in < 8 seconds
- Data persists across hibernation cycle
---
### Test 3: Multiple Databases
**Setup:** 3 nodes running
**Test Steps:**
1. **Create Multiple Databases**
```
Create: users_db
Create: products_db
Create: orders_db
```
2. **Verify Isolation**
- Insert data in users_db
- Verify data NOT in products_db
- Verify data NOT in orders_db
3. **Check Port Allocation**
```bash
# Verify different ports for each database
netstat -tlnp | grep rqlite
# OR
ss -tlnp | grep rqlite
```
4. **Verify Data Directories**
```bash
tree data/bootstrap/
# Should show:
# ├── users_db/
# ├── products_db/
# └── orders_db/
```
**Expected Results:**
- 3 separate database clusters
- Each with 3 nodes (9 total instances)
- Complete data isolation
- Unique port pairs for each instance
---
### Test 4: Node Failure and Recovery
**Setup:** 4 nodes running, database created on nodes 1-3
**Test Steps:**
1. **Verify Initial State**
- Database active on nodes 1, 2, 3
- Node 4 idle
2. **Simulate Failure**
```bash
# Kill node 3 (SIGKILL for unclean shutdown)
kill -9 <node3_pid>
```
3. **Watch for Detection**
- Check logs on nodes 1 and 2
- Wait for health check failures (3 missed pings)
- Verify "Node detected as unhealthy" messages
4. **Watch for Replacement**
- Check for "NODE_REPLACEMENT_NEEDED" broadcast
- Node 4 should offer to replace
- Verify "Starting as replacement node" on node 4
- Verify node 4 joins Raft cluster
5. **Verify Data Integrity**
- Query database from node 4
- Verify all data present
- Insert new data from node 4
- Verify replication to nodes 1 and 2
**Expected Results:**
- Failure detected within 30 seconds
- Replacement completes automatically
- Data accessible from new node
- No data loss
---
### Test 5: Port Exhaustion
**Setup:** 1 node with small port range
**Configuration:**
```yaml
database:
  max_databases: 10
  port_range_http_start: 5001
  port_range_http_end: 5002   # Only 2 HTTP ports
  port_range_raft_start: 7001
  port_range_raft_end: 7002   # Only 2 Raft ports
```
**Test Steps:**
1. **Create Databases**
- Create database 1 (succeeds - takes one HTTP and one Raft port)
- Create database 2 (succeeds - both ranges now exhausted)
- Create database 3 (fails - no ports left)
2. **Verify Error**
- Check logs for "Cannot allocate ports"
- Verify error returned to client
3. **Free Ports**
- Hibernate or delete database 1
- Ports should be freed
4. **Retry**
- Create database 3 again
- Should succeed now
**Expected Results:**
- Graceful handling of port exhaustion
- Clear error messages
- Ports properly recycled
---
### Test 6: Orphaned Data Cleanup
**Setup:** 1 node stopped
**Test Steps:**
1. **Create Orphaned Data**
```bash
# While node is stopped
mkdir -p data/bootstrap/orphaned_db/rqlite
echo "fake data" > data/bootstrap/orphaned_db/rqlite/db.sqlite
```
2. **Start Node**
```bash
./bin/node --data bootstrap --id bootstrap
```
3. **Check Reconciliation**
- Watch logs for "Starting orphaned data reconciliation"
- Verify "Found orphaned database directory"
- Verify "Removed orphaned database directory"
4. **Verify Cleanup**
```bash
ls data/bootstrap/
# orphaned_db should be gone
```
**Expected Results:**
- Orphaned directories automatically detected
- Removed on startup
- Clean reconciliation logged
---
### Test 7: Stress Test - Many Databases
**Setup:** 5 nodes with high capacity
**Configuration:**
```yaml
database:
max_databases: 50
port_range_http_start: 5001
port_range_http_end: 5150
port_range_raft_start: 7001
port_range_raft_end: 7150
```
**Test Steps:**
1. **Create Many Databases**
```
Loop: Create databases db_1 through db_25
```
2. **Verify Distribution**
- Check logs for node capacity announcements
- Verify databases distributed across nodes
- No single node overloaded
3. **Concurrent Operations**
- Write to multiple databases simultaneously
- Read from multiple databases
- Verify no conflicts
4. **Hibernation Wave**
- Stop all activity
- Wait for hibernation
- Verify all databases hibernate
- Check resource usage drops
5. **Wake-Up Storm**
- Query all 25 databases at once
- Verify all wake up successfully
- Check for thundering herd issues
**Expected Results:**
- All 25 databases created successfully
- Even distribution across nodes
- No port conflicts
- Successful mass hibernation/wake-up
---
### Test 8: Gateway API Access
**Setup:** Gateway running with 3 nodes
**Test Steps:**
1. **Authenticate**
```bash
# Get JWT token
TOKEN=$(curl -X POST http://localhost:8080/v1/auth/login \
-H "Content-Type: application/json" \
-d '{"wallet": "..."}' | jq -r .token)
```
2. **Create Table**
```bash
curl -X POST http://localhost:8080/v1/database/create-table \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"database": "testdb",
"schema": "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
}'
```
3. **Insert Data**
```bash
curl -X POST http://localhost:8080/v1/database/exec \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"database": "testdb",
"sql": "INSERT INTO users (name, email) VALUES (?, ?)",
"args": ["Alice", "alice@example.com"]
}'
```
4. **Query Data**
```bash
curl -X POST http://localhost:8080/v1/database/query \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"database": "testdb",
"sql": "SELECT * FROM users"
}'
```
5. **Test Transaction**
```bash
curl -X POST http://localhost:8080/v1/database/transaction \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"database": "testdb",
"queries": [
"INSERT INTO users (name, email) VALUES (\"Bob\", \"bob@example.com\")",
"INSERT INTO users (name, email) VALUES (\"Charlie\", \"charlie@example.com\")"
]
}'
```
6. **Get Schema**
```bash
curl -X GET "http://localhost:8080/v1/database/schema?database=testdb" \
-H "Authorization: Bearer $TOKEN"
```
7. **Test Hibernation**
- Wait for hibernation timeout
- Query again and measure wake-up time
- Should see delay on first query after hibernation
**Expected Results:**
- All API calls succeed
- Data persists across calls
- Transactions are atomic
- Schema reflects created tables
- Hibernation/wake-up transparent to API
- Response times reasonable (< 30s for queries)
---
## Test Checklist
### Unit Tests (To Implement)
- [ ] Metadata Store operations
- [ ] Metadata Store concurrency
- [ ] Vector Clock increment
- [ ] Vector Clock merge
- [ ] Vector Clock compare
- [ ] Coordinator election (single node)
- [ ] Coordinator election (multiple nodes)
- [ ] Coordinator election (deterministic)
- [ ] Port Manager allocation
- [ ] Port Manager release
- [ ] Port Manager exhaustion
- [ ] Port Manager specific ports
- [ ] RQLite Instance creation
- [ ] RQLite Instance IsIdle
- [ ] Message marshal/unmarshal (all types)
- [ ] Coordinator response collection
- [ ] Coordinator node selection
- [ ] Coordinator registry
### Integration Tests (To Implement)
- [ ] Single node database creation
- [ ] Three node database creation
- [ ] Multiple databases isolation
- [ ] Hibernation cycle
- [ ] Wake-up cycle
- [ ] Node failure detection
- [ ] Node replacement
- [ ] Orphaned data cleanup
- [ ] Concurrent database creation
- [ ] Concurrent hibernation
### Manual Tests (To Perform)
- [ ] Basic three node flow
- [ ] Hibernation and wake-up
- [ ] Multiple databases
- [ ] Node failure and recovery
- [ ] Port exhaustion handling
- [ ] Orphaned data cleanup
- [ ] Stress test with many databases
### Performance Validation
- [ ] Database creation < 10s
- [ ] Wake-up time < 8s
- [ ] Metadata sync < 5s
- [ ] Query overhead < 10ms additional
## Running Tests
### Unit Tests
```bash
# Run all tests
go test ./pkg/rqlite/... -v
# Run with race detector
go test ./pkg/rqlite/... -race
# Run specific test
go test ./pkg/rqlite/ -run TestMetadataStore_GetSetDatabase -v
# Run with coverage
go test ./pkg/rqlite/... -cover -coverprofile=coverage.out
go tool cover -html=coverage.out
```
### Integration Tests
```bash
# Run e2e tests
go test ./e2e/... -v -timeout 30m
# Run specific e2e test
go test ./e2e/ -run TestThreeNodeDatabaseCreation -v
```
### Manual Tests
Follow the scenarios above in dedicated terminals for each node.
## Success Criteria
### Correctness
✅ All unit tests pass
✅ All integration tests pass
✅ All manual scenarios complete successfully
✅ No data loss in any scenario
✅ No race conditions detected
### Performance
✅ Database creation < 10 seconds
✅ Wake-up < 8 seconds
✅ Metadata sync < 5 seconds
✅ Query overhead < 10ms
### Reliability
✅ Survives node failures
✅ Automatic recovery works
✅ No orphaned data accumulates
✅ Hibernation/wake-up cycles stable
✅ Concurrent operations safe
## Notes for Future Test Enhancements
When implementing advanced metrics and benchmarks:
1. **Prometheus Metrics Tests**
- Verify metric export
- Validate metric values
- Test metric reset on restart
2. **Benchmark Suite**
- Automated performance regression detection
- Latency percentile tracking (p50, p95, p99)
- Throughput measurements
- Resource usage profiling
3. **Chaos Engineering**
- Random node kills
- Network partitions
- Clock skew simulation
- Disk full scenarios
4. **Long-Running Stability**
- 24-hour soak test
- Memory leak detection
- Slow-growing resource usage
## Debugging Failed Tests
### Common Issues
**Port Conflicts**
```bash
# Check for processes using test ports
lsof -i :5001-5999
lsof -i :7001-7999
# Kill stale processes
pkill rqlited
```
**Stale Data**
```bash
# Clean test data directories
rm -rf data/test_*/
rm -rf /tmp/debros_test_*/
```
**Timing Issues**
- Increase timeouts in flaky tests
- Add retry logic with exponential backoff
- Use proper synchronization primitives
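For the backoff advice above, a small helper along these lines keeps it uniform across tests (illustrative, not part of the codebase):

```go
// retryWithBackoff retries fn with exponential backoff; useful when a test
// must wait for eventually-consistent state such as metadata propagation.
func retryWithBackoff(maxAttempts int, initial time.Duration, fn func() error) error {
	delay := initial
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		time.Sleep(delay)
		delay *= 2
	}
	return fmt.Errorf("after %d attempts: %w", maxAttempts, err)
}
```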
**Race Conditions**
```bash
# Always run with race detector during development
go test -race ./...
```


@ -31,16 +31,13 @@ func setup_logger(component logging.Component) (logger *logging.ColoredLogger) {
}
// parse_and_return_network_flags initializes all the network flags coming from the .yaml files
func parse_and_return_network_flags() (configPath *string, dataDir, nodeID *string, p2pPort, rqlHTTP, rqlRaft *int, rqlJoinAddr *string, advAddr *string, help *bool) {
func parse_and_return_network_flags() (configPath *string, dataDir, nodeID *string, p2pPort *int, advAddr *string, help *bool, loadedConfig *config.Config) {
logger := setup_logger(logging.ComponentNode)
configPath = flag.String("config", "", "Path to config YAML file (overrides defaults)")
dataDir = flag.String("data", "", "Data directory (auto-detected if not provided)")
nodeID = flag.String("id", "", "Node identifier (for running multiple local nodes)")
p2pPort = flag.Int("p2p-port", 4001, "LibP2P listen port")
rqlHTTP = flag.Int("rqlite-http-port", 5001, "RQLite HTTP API port")
rqlRaft = flag.Int("rqlite-raft-port", 7001, "RQLite Raft port")
rqlJoinAddr = flag.String("rqlite-join-address", "", "RQLite address to join (e.g., /ip4/)")
advAddr = flag.String("adv-addr", "127.0.0.1", "Default Advertise address for rqlite and rafts")
help = flag.Bool("help", false, "Show help")
flag.Parse()
@ -55,33 +52,18 @@ func parse_and_return_network_flags() (configPath *string, dataDir, nodeID *stri
}
logger.ComponentInfo(logging.ComponentNode, "Configuration loaded from YAML file", zap.String("path", *configPath))
// Instead of returning flag values, return config values
// For ListenAddresses, extract port from multiaddr string if possible, else use default
var p2pPortVal int
if len(cfg.Node.ListenAddresses) > 0 {
// Try to parse port from multiaddr string
var port int
_, err := fmt.Sscanf(cfg.Node.ListenAddresses[0], "/ip4/0.0.0.0/tcp/%d", &port)
if err == nil {
p2pPortVal = port
} else {
p2pPortVal = 4001
}
} else {
p2pPortVal = 4001
}
// Return config values but preserve command line flag values for overrides
// The command line flags will be applied later in load_args_into_config
return configPath,
&cfg.Node.DataDir,
&cfg.Node.ID,
&p2pPortVal,
&cfg.Database.RQLitePort,
&cfg.Database.RQLiteRaftPort,
&cfg.Database.RQLiteJoinAddress,
p2pPort, // Keep the command line flag value
&cfg.Discovery.HttpAdvAddress,
help
help,
cfg // Return the loaded config
}
return
return configPath, dataDir, nodeID, p2pPort, advAddr, help, nil
}
// LoadConfigFromYAML loads a config from a YAML file
@ -109,8 +91,13 @@ func check_if_should_open_help(help *bool) {
func select_data_dir(dataDir *string, nodeID *string) {
logger := setup_logger(logging.ComponentNode)
if *nodeID == "" {
*dataDir = "./data/node"
// If dataDir is not set from config, set it based on nodeID
if *dataDir == "" {
if *nodeID == "" {
*dataDir = "./data/node"
} else {
*dataDir = fmt.Sprintf("./data/%s", *nodeID)
}
}
logger.Info("Successfully selected data directory", zap.String("dataDir", *dataDir))
@ -151,38 +138,30 @@ func startNode(ctx context.Context, cfg *config.Config, port int) error {
}
// load_args_into_config applies command line argument overrides to the config
func load_args_into_config(cfg *config.Config, p2pPort, rqlHTTP, rqlRaft *int, rqlJoinAddr *string, advAddr *string, dataDir *string) {
func load_args_into_config(cfg *config.Config, p2pPort *int, advAddr *string, dataDir *string) {
logger := setup_logger(logging.ComponentNode)
// Apply RQLite HTTP port override
if *rqlHTTP != 5001 {
cfg.Database.RQLitePort = *rqlHTTP
logger.ComponentInfo(logging.ComponentNode, "Overriding RQLite HTTP port", zap.Int("port", *rqlHTTP))
// Apply P2P port override - check if command line port differs from config
var configPort int = 4001 // default
if len(cfg.Node.ListenAddresses) > 0 {
// Try to parse port from multiaddr string in config
_, err := fmt.Sscanf(cfg.Node.ListenAddresses[0], "/ip4/0.0.0.0/tcp/%d", &configPort)
if err != nil {
configPort = 4001 // fallback to default
}
}
// Apply RQLite Raft port override
if *rqlRaft != 7001 {
cfg.Database.RQLiteRaftPort = *rqlRaft
logger.ComponentInfo(logging.ComponentNode, "Overriding RQLite Raft port", zap.Int("port", *rqlRaft))
}
// Apply P2P port override
if *p2pPort != 4001 {
// Override if command line port is different from config port
if *p2pPort != configPort {
cfg.Node.ListenAddresses = []string{
fmt.Sprintf("/ip4/0.0.0.0/tcp/%d", *p2pPort),
}
logger.ComponentInfo(logging.ComponentNode, "Overriding P2P port", zap.Int("port", *p2pPort))
}
// Apply RQLite join address
if *rqlJoinAddr != "" {
cfg.Database.RQLiteJoinAddress = *rqlJoinAddr
logger.ComponentInfo(logging.ComponentNode, "Setting RQLite join address", zap.String("address", *rqlJoinAddr))
}
if *advAddr != "" {
cfg.Discovery.HttpAdvAddress = fmt.Sprintf("%s:%d", *advAddr, *rqlHTTP)
cfg.Discovery.RaftAdvAddress = fmt.Sprintf("%s:%d", *advAddr, *rqlRaft)
cfg.Discovery.HttpAdvAddress = *advAddr
cfg.Discovery.RaftAdvAddress = *advAddr
}
if *dataDir != "" {
@ -193,30 +172,35 @@ func load_args_into_config(cfg *config.Config, p2pPort, rqlHTTP, rqlRaft *int, r
func main() {
logger := setup_logger(logging.ComponentNode)
_, dataDir, nodeID, p2pPort, rqlHTTP, rqlRaft, rqlJoinAddr, advAddr, help := parse_and_return_network_flags()
_, dataDir, nodeID, p2pPort, advAddr, help, loadedConfig := parse_and_return_network_flags()
check_if_should_open_help(help)
select_data_dir(dataDir, nodeID)
// Load Node Configuration
// Load Node Configuration - use loaded config if available, otherwise use default
var cfg *config.Config
cfg = config.DefaultConfig()
logger.ComponentInfo(logging.ComponentNode, "Default configuration loaded successfully")
if loadedConfig != nil {
cfg = loadedConfig
logger.ComponentInfo(logging.ComponentNode, "Using configuration from YAML file")
} else {
cfg = config.DefaultConfig()
logger.ComponentInfo(logging.ComponentNode, "Using default configuration")
}
// Apply command line argument overrides
load_args_into_config(cfg, p2pPort, rqlHTTP, rqlRaft, rqlJoinAddr, advAddr, dataDir)
load_args_into_config(cfg, p2pPort, advAddr, dataDir)
logger.ComponentInfo(logging.ComponentNode, "Command line arguments applied to configuration")
// LibP2P uses configurable port (default 4001); RQLite uses 5001 (HTTP) and 7001 (Raft)
// LibP2P uses configurable port (default 4001)
port := *p2pPort
logger.ComponentInfo(logging.ComponentNode, "Node configuration summary",
zap.Strings("listen_addresses", cfg.Node.ListenAddresses),
zap.Int("rqlite_http_port", cfg.Database.RQLitePort),
zap.Int("rqlite_raft_port", cfg.Database.RQLiteRaftPort),
zap.Int("p2p_port", port),
zap.Strings("bootstrap_peers", cfg.Discovery.BootstrapPeers),
zap.String("rqlite_join_address", cfg.Database.RQLiteJoinAddress),
zap.Int("max_databases", cfg.Database.MaxDatabases),
zap.String("port_range_http", fmt.Sprintf("%d-%d", cfg.Database.PortRangeHTTPStart, cfg.Database.PortRangeHTTPEnd)),
zap.String("port_range_raft", fmt.Sprintf("%d-%d", cfg.Database.PortRangeRaftStart, cfg.Database.PortRangeRaftEnd)),
zap.String("data_directory", *dataDir))
// Create context for graceful shutdown


@ -1,175 +1,39 @@
package client
import (
"os"
"strconv"
"strings"
"fmt"
"time"
"github.com/DeBrosOfficial/network/pkg/config"
"github.com/multiformats/go-multiaddr"
)
// DefaultBootstrapPeers returns the library's default bootstrap peer multiaddrs.
// These can be overridden by environment variables or config.
func DefaultBootstrapPeers() []string {
// DefaultClientConfig returns a default client configuration
func DefaultClientConfig(appName string) *ClientConfig {
defaultCfg := config.DefaultConfig()
return defaultCfg.Discovery.BootstrapPeers
}
// DefaultDatabaseEndpoints returns default DB HTTP endpoints.
// These can be overridden by environment variables or config.
func DefaultDatabaseEndpoints() []string {
// Check environment variable first
if envNodes := os.Getenv("RQLITE_NODES"); envNodes != "" {
return normalizeEndpoints(splitCSVOrSpace(envNodes))
}
// Get default port from environment or use port from config
defaultCfg := config.DefaultConfig()
port := defaultCfg.Database.RQLitePort
if envPort := os.Getenv("RQLITE_PORT"); envPort != "" {
if p, err := strconv.Atoi(envPort); err == nil && p > 0 {
port = p
}
}
// Try to derive from bootstrap peers if available
peers := DefaultBootstrapPeers()
if len(peers) > 0 {
endpoints := make([]string, 0, len(peers))
for _, s := range peers {
ma, err := multiaddr.NewMultiaddr(s)
if err != nil {
continue
}
endpoints = append(endpoints, endpointFromMultiaddr(ma, port))
}
return dedupeStrings(endpoints)
}
// Fallback to localhost
return []string{"http://localhost:" + strconv.Itoa(port)}
}
// MapAddrsToDBEndpoints converts a set of peer multiaddrs to DB HTTP endpoints using dbPort.
func MapAddrsToDBEndpoints(addrs []multiaddr.Multiaddr, dbPort int) []string {
if dbPort <= 0 {
dbPort = 5001
}
eps := make([]string, 0, len(addrs))
for _, ma := range addrs {
eps = append(eps, endpointFromMultiaddr(ma, dbPort))
}
return dedupeStrings(eps)
}
// endpointFromMultiaddr extracts host from multiaddr and creates HTTP endpoint
func endpointFromMultiaddr(ma multiaddr.Multiaddr, port int) string {
var host string
// Prefer DNS if present, then IP
if v, err := ma.ValueForProtocol(multiaddr.P_DNS); err == nil && v != "" {
host = v
}
if host == "" {
if v, err := ma.ValueForProtocol(multiaddr.P_DNS4); err == nil && v != "" {
host = v
}
}
if host == "" {
if v, err := ma.ValueForProtocol(multiaddr.P_DNS6); err == nil && v != "" {
host = v
}
}
if host == "" {
if v, err := ma.ValueForProtocol(multiaddr.P_IP4); err == nil && v != "" {
host = v
}
}
if host == "" {
if v, err := ma.ValueForProtocol(multiaddr.P_IP6); err == nil && v != "" {
host = "[" + v + "]" // IPv6 needs brackets in URLs
}
}
if host == "" {
host = "localhost"
}
return "http://" + host + ":" + strconv.Itoa(port)
}
// normalizeEndpoints ensures each endpoint has an http scheme and a port (defaults to 5001)
func normalizeEndpoints(in []string) []string {
out := make([]string, 0, len(in))
for _, s := range in {
s = strings.TrimSpace(s)
if s == "" {
continue
}
// Prepend scheme if missing
if !strings.HasPrefix(s, "http://") && !strings.HasPrefix(s, "https://") {
s = "http://" + s
}
// Simple check for port (doesn't handle all cases but good enough)
if !strings.Contains(s, ":5001") && !strings.Contains(s, ":500") && !strings.Contains(s, ":501") {
// Check if there's already a port after the host
parts := strings.Split(s, "://")
if len(parts) == 2 {
hostPart := parts[1]
// Count colons to detect port (simple heuristic)
colonCount := strings.Count(hostPart, ":")
if colonCount == 0 || (strings.Contains(hostPart, "[") && colonCount == 1) {
// No port found, add default
s = s + ":5001"
}
}
}
out = append(out, s)
}
return out
}
// dedupeStrings removes duplicate strings from slice
func dedupeStrings(in []string) []string {
if len(in) == 0 {
return in
}
seen := make(map[string]struct{}, len(in))
out := make([]string, 0, len(in))
for _, s := range in {
s = strings.TrimSpace(s)
if s == "" {
continue
}
if _, ok := seen[s]; ok {
continue
}
seen[s] = struct{}{}
out = append(out, s)
}
return out
}
// splitCSVOrSpace splits a string by commas or spaces
func splitCSVOrSpace(s string) []string {
// Replace commas with spaces, then split on spaces
s = strings.ReplaceAll(s, ",", " ")
fields := strings.Fields(s)
return fields
}
// truthy reports if s is a common truthy string
func truthy(s string) bool {
switch strings.ToLower(strings.TrimSpace(s)) {
case "1", "true", "yes", "on":
return true
default:
return false
return &ClientConfig{
AppName: appName,
DatabaseName: fmt.Sprintf("%s_db", appName),
BootstrapPeers: defaultCfg.Discovery.BootstrapPeers,
DatabaseEndpoints: []string{},
ConnectTimeout: 30 * time.Second,
RetryAttempts: 3,
RetryDelay: 5 * time.Second,
QuietMode: false,
APIKey: "",
JWT: "",
}
}
// ValidateClientConfig validates a client configuration
func ValidateClientConfig(cfg *ClientConfig) error {
if len(cfg.BootstrapPeers) == 0 {
return fmt.Errorf("at least one bootstrap peer is required")
}
if cfg.AppName == "" {
return fmt.Errorf("app name is required")
}
return nil
}


@ -1,52 +0,0 @@
package client
import (
"os"
"testing"
"github.com/multiformats/go-multiaddr"
)
func TestDefaultBootstrapPeersNonEmpty(t *testing.T) {
old := os.Getenv("DEBROS_BOOTSTRAP_PEERS")
t.Cleanup(func() { os.Setenv("DEBROS_BOOTSTRAP_PEERS", old) })
_ = os.Setenv("DEBROS_BOOTSTRAP_PEERS", "") // ensure not set
peers := DefaultBootstrapPeers()
if len(peers) == 0 {
t.Fatalf("expected non-empty default bootstrap peers")
}
}
func TestDefaultDatabaseEndpointsEnvOverride(t *testing.T) {
oldNodes := os.Getenv("RQLITE_NODES")
t.Cleanup(func() { os.Setenv("RQLITE_NODES", oldNodes) })
_ = os.Setenv("RQLITE_NODES", "db1.local:7001, https://db2.local:7443")
endpoints := DefaultDatabaseEndpoints()
if len(endpoints) != 2 {
t.Fatalf("expected 2 endpoints from env, got %v", endpoints)
}
}
func TestNormalizeEndpoints(t *testing.T) {
in := []string{"db.local", "http://db.local:5001", "[::1]", "https://host:8443"}
out := normalizeEndpoints(in)
if len(out) != 4 {
t.Fatalf("unexpected len: %v", out)
}
foundDefault := false
for _, s := range out {
if s == "http://db.local:5001" {
foundDefault = true
}
}
if !foundDefault {
t.Fatalf("missing normalized default port: %v", out)
}
}
func TestEndpointFromMultiaddr(t *testing.T) {
ma, _ := multiaddr.NewMultiaddr("/ip4/127.0.0.1/tcp/4001")
if ep := endpointFromMultiaddr(ma, 5001); ep != "http://127.0.0.1:5001" {
t.Fatalf("unexpected endpoint: %s", ep)
}
}


@ -6,6 +6,7 @@ import (
"strings"
"sync"
"time"
"unicode"
"github.com/libp2p/go-libp2p/core/peer"
"github.com/multiformats/go-multiaddr"
@ -14,9 +15,10 @@ import (
// DatabaseClientImpl implements DatabaseClient
type DatabaseClientImpl struct {
client *Client
connection *gorqlite.Connection
mu sync.RWMutex
client *Client
connection *gorqlite.Connection
databaseName string // Empty for default database, or specific database name
mu sync.RWMutex
}
// checkConnection verifies the client is connected
@ -176,19 +178,17 @@ func (d *DatabaseClientImpl) getRQLiteConnection() (*gorqlite.Connection, error)
// getRQLiteNodes returns a list of RQLite node URLs with precedence:
// 1) client config DatabaseEndpoints
// 2) RQLITE_NODES env (comma/space separated)
// 3) library defaults via DefaultDatabaseEndpoints()
// 3) library defaults via bootstrap peers
func (d *DatabaseClientImpl) getRQLiteNodes() []string {
// 1) Prefer explicit configuration on the client
if d.client != nil && d.client.config != nil && len(d.client.config.DatabaseEndpoints) > 0 {
return dedupeStrings(normalizeEndpoints(d.client.config.DatabaseEndpoints))
return d.client.config.DatabaseEndpoints
}
// 3) Fallback to library defaults derived from bootstrap peers
return DefaultDatabaseEndpoints()
// 2) Return empty - dynamic clustering will determine endpoints
return []string{}
}
// normalizeEndpoints is now imported from defaults.go
func hasPort(hostport string) bool {
// cheap check for :port suffix (IPv6 with brackets handled by url.Parse earlier)
if i := strings.LastIndex(hostport, ":"); i > -1 && i < len(hostport)-1 {
@ -392,6 +392,46 @@ func (d *DatabaseClientImpl) GetSchema(ctx context.Context) (*SchemaInfo, error)
return schema, nil
}
// Database returns a database client for the named database
// The database name is prefixed with the app name for isolation
func (d *DatabaseClientImpl) Database(name string) (DatabaseClient, error) {
if !d.client.isConnected() {
return nil, fmt.Errorf("client not connected")
}
// Sanitize and prefix database name
appName := d.client.getAppNamespace()
fullDBName := sanitizeDatabaseName(appName, name)
// Create a new database client instance for this specific database
dbClient := &DatabaseClientImpl{
client: d.client,
databaseName: fullDBName,
}
return dbClient, nil
}
// sanitizeDatabaseName creates a sanitized database name with app prefix
func sanitizeDatabaseName(appName, dbName string) string {
sanitizedApp := sanitizeIdentifier(appName)
sanitizedDB := sanitizeIdentifier(dbName)
return fmt.Sprintf("%s_%s", sanitizedApp, sanitizedDB)
}
// sanitizeIdentifier sanitizes an identifier (app or database name)
func sanitizeIdentifier(name string) string {
var result strings.Builder
for _, r := range name {
if unicode.IsLetter(r) || unicode.IsNumber(r) || r == '_' {
result.WriteRune(unicode.ToLower(r))
} else if r == '-' || r == ' ' {
result.WriteRune('_')
}
}
return result.String()
}
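// Illustrative examples (not part of this change) of the namespacing above:
// letters and digits are lowercased, underscores kept, '-' and ' ' mapped to '_',
// and everything else dropped. For an app named "My App":
//
//	sanitizeDatabaseName("My App", "Orders-DB") // "my_app_orders_db"
//	sanitizeDatabaseName("My App", "cache v2")  // "my_app_cache_v2"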
// NetworkInfoImpl implements NetworkInfo
type NetworkInfoImpl struct {
client *Client

View File

@ -2,7 +2,6 @@ package client
import (
"context"
"fmt"
"time"
)
@ -33,6 +32,11 @@ type DatabaseClient interface {
CreateTable(ctx context.Context, schema string) error
DropTable(ctx context.Context, tableName string) error
GetSchema(ctx context.Context) (*SchemaInfo, error)
// Multi-database support (NEW)
// Database returns a database client for the named database
// The database name will be prefixed with the app name for isolation
Database(name string) (DatabaseClient, error)
}
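// Usage sketch (illustrative; assumes `cli` is an already-connected client whose
// Database() accessor returns this DatabaseClient, as the gateway handlers use it):
//
//	db, err := cli.Database().Database("orders") // resolves to "<app>_orders"
//	if err != nil {
//		return err
//	}
//	res, err := db.Query(ctx, "SELECT id, total FROM orders WHERE total > ?", 100)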
// PubSubClient provides publish/subscribe messaging
@ -120,23 +124,3 @@ type ClientConfig struct {
APIKey string `json:"api_key"` // API key for gateway auth
JWT string `json:"jwt"` // Optional JWT bearer token
}
// DefaultClientConfig returns a default client configuration
func DefaultClientConfig(appName string) *ClientConfig {
// Base defaults
peers := DefaultBootstrapPeers()
endpoints := DefaultDatabaseEndpoints()
return &ClientConfig{
AppName: appName,
DatabaseName: fmt.Sprintf("%s_db", appName),
BootstrapPeers: peers,
DatabaseEndpoints: endpoints,
ConnectTimeout: time.Second * 30,
RetryAttempts: 3,
RetryDelay: time.Second * 5,
QuietMode: false,
APIKey: "",
JWT: "",
}
}

View File

@ -26,26 +26,29 @@ type NodeConfig struct {
// DatabaseConfig contains database-related configuration
type DatabaseConfig struct {
DataDir string `yaml:"data_dir"`
ReplicationFactor int `yaml:"replication_factor"`
ShardCount int `yaml:"shard_count"`
MaxDatabaseSize int64 `yaml:"max_database_size"` // In bytes
BackupInterval time.Duration `yaml:"backup_interval"`
// RQLite-specific configuration
RQLitePort int `yaml:"rqlite_port"` // RQLite HTTP API port
RQLiteRaftPort int `yaml:"rqlite_raft_port"` // RQLite Raft consensus port
RQLiteJoinAddress string `yaml:"rqlite_join_address"` // Address to join RQLite cluster
// Dynamic database clustering
HibernationTimeout time.Duration `yaml:"hibernation_timeout"` // Idle duration before hibernation
MaxDatabases int `yaml:"max_databases"` // Max databases per node
PortRangeHTTPStart int `yaml:"port_range_http_start"` // HTTP port range start
PortRangeHTTPEnd int `yaml:"port_range_http_end"` // HTTP port range end
PortRangeRaftStart int `yaml:"port_range_raft_start"` // Raft port range start
PortRangeRaftEnd int `yaml:"port_range_raft_end"` // Raft port range end
}
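// Example (illustrative): adjusting the dynamic-clustering knobs from code.
// Field names come from the struct above; the values are arbitrary.
//
//	cfg := DefaultConfig()
//	cfg.Database.HibernationTimeout = 5 * time.Minute
//	cfg.Database.MaxDatabases = 25
//	cfg.Database.PortRangeHTTPStart, cfg.Database.PortRangeHTTPEnd = 6001, 6099
//	cfg.Database.PortRangeRaftStart, cfg.Database.PortRangeRaftEnd = 8001, 8099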
// DiscoveryConfig contains peer discovery configuration
type DiscoveryConfig struct {
BootstrapPeers []string `yaml:"bootstrap_peers"` // Bootstrap peer addresses
DiscoveryInterval time.Duration `yaml:"discovery_interval"` // Discovery announcement interval
BootstrapPort int `yaml:"bootstrap_port"` // Default port for bootstrap nodes
HttpAdvAddress string `yaml:"http_adv_address"` // HTTP advertisement address
RaftAdvAddress string `yaml:"raft_adv_address"` // Raft advertisement
NodeNamespace string `yaml:"node_namespace"` // Namespace for node identifiers
BootstrapPeers []string `yaml:"bootstrap_peers"` // Bootstrap peer addresses
DiscoveryInterval time.Duration `yaml:"discovery_interval"` // Discovery announcement interval
BootstrapPort int `yaml:"bootstrap_port"` // Default port for bootstrap nodes
HttpAdvAddress string `yaml:"http_adv_address"` // HTTP advertisement address
RaftAdvAddress      string        `yaml:"raft_adv_address"`      // Raft advertisement address
NodeNamespace string `yaml:"node_namespace"` // Namespace for node identifiers
HealthCheckInterval time.Duration `yaml:"health_check_interval"` // Health check interval for node monitoring
}
// SecurityConfig contains security-related configuration
@ -96,30 +99,34 @@ func DefaultConfig() *Config {
MaxConnections: 50,
},
Database: DatabaseConfig{
DataDir: "./data/db",
ReplicationFactor: 3,
ShardCount: 16,
MaxDatabaseSize: 1024 * 1024 * 1024, // 1GB
BackupInterval: time.Hour * 24, // Daily backups
// RQLite-specific configuration
RQLitePort: 5001,
RQLiteRaftPort: 7001,
RQLiteJoinAddress: "", // Empty for bootstrap node
// Dynamic database clustering
HibernationTimeout: 60 * time.Second,
MaxDatabases: 100,
PortRangeHTTPStart: 5001,
PortRangeHTTPEnd: 5999,
PortRangeRaftStart: 7001,
PortRangeRaftEnd: 7999,
},
Discovery: DiscoveryConfig{
BootstrapPeers: []string{
"/ip4/217.76.54.168/tcp/4001/p2p/12D3KooWDp7xeShVY9uHfqNVPSsJeCKUatAviFZV8Y1joox5nUvx",
"/ip4/217.76.54.178/tcp/4001/p2p/12D3KooWKZnirPwNT4URtNSWK45f6vLkEs4xyUZ792F8Uj1oYnm1",
"/ip4/51.83.128.181/tcp/4001/p2p/12D3KooWBn2Zf1R8v9pEfmz7hDZ5b3oADxfejA3zJBYzKRCzgvhR",
"/ip4/155.133.27.199/tcp/4001/p2p/12D3KooWC69SBzM5QUgrLrfLWUykE8au32X5LwT7zwv9bixrQPm1",
"/ip4/217.76.56.2/tcp/4001/p2p/12D3KooWEiqJHvznxqJ5p2y8mUs6Ky6dfU1xTYFQbyKRCABfcZz4",
"/ip4/127.0.0.1/tcp/4001/p2p/12D3KooWKdj4B3LdZ8whYGaa97giwWCoSELciRp6qsFrDvz2Etah",
// "/ip4/217.76.54.168/tcp/4001/p2p/12D3KooWDp7xeShVY9uHfqNVPSsJeCKUatAviFZV8Y1joox5nUvx",
// "/ip4/217.76.54.178/tcp/4001/p2p/12D3KooWKZnirPwNT4URtNSWK45f6vLkEs4xyUZ792F8Uj1oYnm1",
// "/ip4/51.83.128.181/tcp/4001/p2p/12D3KooWBn2Zf1R8v9pEfmz7hDZ5b3oADxfejA3zJBYzKRCzgvhR",
// "/ip4/155.133.27.199/tcp/4001/p2p/12D3KooWC69SBzM5QUgrLrfLWUykE8au32X5LwT7zwv9bixrQPm1",
// "/ip4/217.76.56.2/tcp/4001/p2p/12D3KooWEiqJHvznxqJ5p2y8mUs6Ky6dfU1xTYFQbyKRCABfcZz4",
},
BootstrapPort: 4001, // Default LibP2P port
DiscoveryInterval: time.Second * 15, // Back to 15 seconds for testing
HttpAdvAddress: "",
RaftAdvAddress: "",
NodeNamespace: "default",
BootstrapPort: 4001, // Default LibP2P port
DiscoveryInterval: time.Second * 15, // Back to 15 seconds for testing
HttpAdvAddress: "",
RaftAdvAddress: "",
NodeNamespace: "default",
HealthCheckInterval: 10 * time.Second, // Health check interval
},
Security: SecurityConfig{
EnableTLS: false,

View File

@ -0,0 +1,449 @@
package gateway
import (
"context"
"encoding/json"
"fmt"
"net/http"
"strings"
"time"
"github.com/DeBrosOfficial/network/pkg/logging"
"go.uber.org/zap"
)
// Database request/response types
type ExecRequest struct {
Database string `json:"database"`
SQL string `json:"sql"`
Args []interface{} `json:"args,omitempty"`
}
type ExecResponse struct {
RowsAffected int64 `json:"rows_affected"`
LastInsertID int64 `json:"last_insert_id,omitempty"`
Error string `json:"error,omitempty"`
}
type QueryRequest struct {
Database string `json:"database"`
SQL string `json:"sql"`
Args []interface{} `json:"args,omitempty"`
}
type QueryResponse struct {
Items []map[string]interface{} `json:"items"`
Count int `json:"count"`
Error string `json:"error,omitempty"`
}
type TransactionRequest struct {
Database string `json:"database"`
Queries []string `json:"queries"`
}
type TransactionResponse struct {
Success bool `json:"success"`
Error string `json:"error,omitempty"`
}
type CreateTableRequest struct {
Database string `json:"database"`
Schema string `json:"schema"`
}
type DropTableRequest struct {
Database string `json:"database"`
TableName string `json:"table_name"`
}
type SchemaResponse struct {
Tables []TableSchema `json:"tables"`
Error string `json:"error,omitempty"`
}
type TableSchema struct {
Name string `json:"name"`
CreateSQL string `json:"create_sql"`
Columns []string `json:"columns,omitempty"`
}
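// Usage sketch (illustrative): calling the query endpoint from Go. The gateway
// address, port, and any auth headers are assumptions, not defined in this file.
//
//	body, _ := json.Marshal(QueryRequest{
//		Database: "orders",
//		SQL:      "SELECT id FROM orders WHERE total > ?",
//		Args:     []interface{}{100},
//	})
//	resp, err := http.Post("http://localhost:8080/v1/database/query",
//		"application/json", bytes.NewReader(body))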
// Database handlers
// databaseExecHandler handles SQL execution (INSERT, UPDATE, DELETE, DDL)
func (g *Gateway) databaseExecHandler(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
var req ExecRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "Invalid request body"})
return
}
if req.Database == "" {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "database field is required"})
return
}
if req.SQL == "" {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "sql field is required"})
return
}
// Get database client
db, err := g.client.Database().Database(req.Database)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: fmt.Sprintf("Failed to access database: %v", err)})
return
}
// Execute with timeout
ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
defer cancel()
// For simplicity, we route everything through Query, even write statements
// In production, you'd detect write vs read and route accordingly
result, err := db.Query(ctx, req.SQL, req.Args...)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Query execution failed",
zap.String("database", req.Database),
zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: err.Error()})
return
}
// For exec operations, return affected rows. Note: result.Count is the number
// of rows returned by Query, used here as an approximation until a dedicated
// exec path exists.
g.respondJSON(w, http.StatusOK, ExecResponse{
RowsAffected: result.Count,
})
}
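// Sketch of the write-vs-read detection mentioned above (illustrative only,
// not wired into the handler):
//
//	func isWriteStatement(sqlStr string) bool {
//		head := strings.ToUpper(strings.TrimSpace(sqlStr))
//		for _, kw := range []string{"INSERT", "UPDATE", "DELETE", "CREATE", "DROP", "ALTER"} {
//			if strings.HasPrefix(head, kw) {
//				return true
//			}
//		}
//		return false
//	}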
// databaseQueryHandler handles SELECT queries
func (g *Gateway) databaseQueryHandler(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
var req QueryRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
g.respondJSON(w, http.StatusBadRequest, QueryResponse{Error: "Invalid request body"})
return
}
if req.Database == "" {
g.respondJSON(w, http.StatusBadRequest, QueryResponse{Error: "database field is required"})
return
}
if req.SQL == "" {
g.respondJSON(w, http.StatusBadRequest, QueryResponse{Error: "sql field is required"})
return
}
// Get database client
db, err := g.client.Database().Database(req.Database)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, QueryResponse{Error: fmt.Sprintf("Failed to access database: %v", err)})
return
}
// Execute with timeout
ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
defer cancel()
result, err := db.Query(ctx, req.SQL, req.Args...)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Query execution failed",
zap.String("database", req.Database),
zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, QueryResponse{Error: err.Error()})
return
}
// Convert result to map format
items := make([]map[string]interface{}, len(result.Rows))
for i, row := range result.Rows {
item := make(map[string]interface{})
for j, col := range result.Columns {
if j < len(row) {
item[col] = row[j]
}
}
items[i] = item
}
g.respondJSON(w, http.StatusOK, QueryResponse{
Items: items,
Count: len(items),
})
}
// databaseTransactionHandler handles atomic transactions
func (g *Gateway) databaseTransactionHandler(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
var req TransactionRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
g.respondJSON(w, http.StatusBadRequest, TransactionResponse{Success: false, Error: "Invalid request body"})
return
}
if req.Database == "" {
g.respondJSON(w, http.StatusBadRequest, TransactionResponse{Success: false, Error: "database field is required"})
return
}
if len(req.Queries) == 0 {
g.respondJSON(w, http.StatusBadRequest, TransactionResponse{Success: false, Error: "queries field is required and must not be empty"})
return
}
// Get database client
db, err := g.client.Database().Database(req.Database)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, TransactionResponse{Success: false, Error: fmt.Sprintf("Failed to access database: %v", err)})
return
}
// Execute with timeout
ctx, cancel := context.WithTimeout(r.Context(), 60*time.Second)
defer cancel()
err = db.Transaction(ctx, req.Queries)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Transaction failed",
zap.String("database", req.Database),
zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, TransactionResponse{Success: false, Error: err.Error()})
return
}
g.respondJSON(w, http.StatusOK, TransactionResponse{Success: true})
}
// databaseSchemaHandler returns database schema information
func (g *Gateway) databaseSchemaHandler(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet && r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
// Support both GET with query param and POST with JSON body
var database string
if r.Method == http.MethodPost {
var req struct {
Database string `json:"database"`
}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
g.respondJSON(w, http.StatusBadRequest, SchemaResponse{Error: "Invalid request body"})
return
}
database = req.Database
} else {
database = r.URL.Query().Get("database")
}
if database == "" {
g.respondJSON(w, http.StatusBadRequest, SchemaResponse{Error: "database parameter is required"})
return
}
// Get database client
db, err := g.client.Database().Database(database)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, SchemaResponse{Error: fmt.Sprintf("Failed to access database: %v", err)})
return
}
// Execute with timeout
ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
defer cancel()
schemaInfo, err := db.GetSchema(ctx)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Failed to get schema",
zap.String("database", database),
zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, SchemaResponse{Error: err.Error()})
return
}
// Convert to response format
tables := make([]TableSchema, len(schemaInfo.Tables))
for i, table := range schemaInfo.Tables {
columns := make([]string, len(table.Columns))
for j, col := range table.Columns {
columns[j] = col.Name
}
tables[i] = TableSchema{
Name: table.Name,
Columns: columns,
}
}
g.respondJSON(w, http.StatusOK, SchemaResponse{Tables: tables})
}
// databaseCreateTableHandler creates a new table
func (g *Gateway) databaseCreateTableHandler(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
var req CreateTableRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "Invalid request body"})
return
}
if req.Database == "" {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "database field is required"})
return
}
if req.Schema == "" {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "schema field is required"})
return
}
// Get database client
db, err := g.client.Database().Database(req.Database)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: fmt.Sprintf("Failed to access database: %v", err)})
return
}
// Execute with timeout
ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
defer cancel()
err = db.CreateTable(ctx, req.Schema)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Failed to create table",
zap.String("database", req.Database),
zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: err.Error()})
return
}
g.respondJSON(w, http.StatusOK, ExecResponse{RowsAffected: 0})
}
// databaseDropTableHandler drops a table
func (g *Gateway) databaseDropTableHandler(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
var req DropTableRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "Invalid request body"})
return
}
if req.Database == "" {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "database field is required"})
return
}
if req.TableName == "" {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "table_name field is required"})
return
}
// Validate table name (basic SQL injection prevention)
if !isValidIdentifier(req.TableName) {
g.respondJSON(w, http.StatusBadRequest, ExecResponse{Error: "invalid table name"})
return
}
// Get database client
db, err := g.client.Database().Database(req.Database)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Failed to get database client", zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: fmt.Sprintf("Failed to access database: %v", err)})
return
}
// Execute with timeout
ctx, cancel := context.WithTimeout(r.Context(), 30*time.Second)
defer cancel()
err = db.DropTable(ctx, req.TableName)
if err != nil {
g.logger.ComponentError(logging.ComponentDatabase, "Failed to drop table",
zap.String("database", req.Database),
zap.String("table", req.TableName),
zap.Error(err))
g.respondJSON(w, http.StatusInternalServerError, ExecResponse{Error: err.Error()})
return
}
g.respondJSON(w, http.StatusOK, ExecResponse{RowsAffected: 0})
}
// databaseListHandler lists all available databases for the current app
func (g *Gateway) databaseListHandler(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodGet {
http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
return
}
// TODO: This would require the ClusterManager to expose a list of databases
// For now, return a placeholder
g.respondJSON(w, http.StatusOK, map[string]interface{}{
"databases": []string{},
"message": "Database listing not yet implemented - query metadata store directly",
})
}
// Helper functions
func (g *Gateway) respondJSON(w http.ResponseWriter, status int, data interface{}) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(status)
if err := json.NewEncoder(w).Encode(data); err != nil {
g.logger.ComponentError(logging.ComponentGeneral, "Failed to encode JSON response", zap.Error(err))
}
}
func isValidIdentifier(name string) bool {
if len(name) == 0 || len(name) > 128 {
return false
}
// Only allow alphanumeric, underscore, and hyphen
for _, r := range name {
if !(r >= 'a' && r <= 'z') && !(r >= 'A' && r <= 'Z') && !(r >= '0' && r <= '9') && r != '_' && r != '-' {
return false
}
}
// Must not start with a digit
firstRune := []rune(name)[0]
if firstRune >= '0' && firstRune <= '9' {
return false
}
// Avoid SQL keywords
upperName := strings.ToUpper(name)
sqlKeywords := []string{"SELECT", "INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER", "TABLE", "DATABASE", "INDEX"}
for _, keyword := range sqlKeywords {
if upperName == keyword {
return false
}
}
return true
}
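// Illustrative outcomes of the validation above:
//
//	isValidIdentifier("users")       // true
//	isValidIdentifier("user-events") // true (hyphen allowed)
//	isValidIdentifier("9lives")      // false (starts with a digit)
//	isValidIdentifier("DROP")        // false (SQL keyword)
//	isValidIdentifier("users; --")   // false (disallowed characters)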

View File

@ -4,16 +4,12 @@ import (
"context"
"crypto/rand"
"crypto/rsa"
"database/sql"
"strconv"
"time"
"github.com/DeBrosOfficial/network/pkg/client"
"github.com/DeBrosOfficial/network/pkg/logging"
"github.com/DeBrosOfficial/network/pkg/rqlite"
"go.uber.org/zap"
_ "github.com/rqlite/gorqlite/stdlib"
)
// Config holds configuration for the gateway server
@ -34,11 +30,6 @@ type Gateway struct {
startedAt time.Time
signingKey *rsa.PrivateKey
keyID string
// rqlite SQL connection and HTTP ORM gateway
sqlDB *sql.DB
ormClient rqlite.Client
ormHTTP *rqlite.HTTPGateway
}
// New creates and initializes a new Gateway instance
@ -87,24 +78,7 @@ func New(logger *logging.ColoredLogger, cfg *Config) (*Gateway, error) {
logger.ComponentWarn(logging.ComponentGeneral, "failed to generate RSA key; jwks will be empty", zap.Error(err))
}
logger.ComponentInfo(logging.ComponentGeneral, "Initializing RQLite ORM HTTP gateway...")
dsn := cfg.RQLiteDSN
if dsn == "" {
dsn = "http://localhost:4001"
}
db, dbErr := sql.Open("rqlite", dsn)
if dbErr != nil {
logger.ComponentWarn(logging.ComponentGeneral, "failed to open rqlite sql db; http orm gateway disabled", zap.Error(dbErr))
} else {
gw.sqlDB = db
orm := rqlite.NewClient(db)
gw.ormClient = orm
gw.ormHTTP = rqlite.NewHTTPGateway(orm, "/v1/db")
logger.ComponentInfo(logging.ComponentGeneral, "RQLite ORM HTTP gateway ready",
zap.String("dsn", dsn),
zap.String("base_path", "/v1/db"),
)
}
logger.ComponentInfo(logging.ComponentGeneral, "Gateway initialized with dynamic database clustering")
logger.ComponentInfo(logging.ComponentGeneral, "Gateway creation completed, returning...")
return gw, nil
@ -122,7 +96,5 @@ func (g *Gateway) Close() {
g.logger.ComponentWarn(logging.ComponentClient, "error during client disconnect", zap.Error(err))
}
}
if g.sqlDB != nil {
_ = g.sqlDB.Close()
}
// No legacy database connections to close
}

View File

@ -27,12 +27,6 @@ func (g *Gateway) Routes() http.Handler {
mux.HandleFunc("/v1/auth/logout", g.logoutHandler)
mux.HandleFunc("/v1/auth/whoami", g.whoamiHandler)
// rqlite ORM HTTP gateway (mounts /v1/rqlite/* endpoints)
if g.ormHTTP != nil {
g.ormHTTP.BasePath = "/v1/rqlite"
g.ormHTTP.RegisterRoutes(mux)
}
// network
mux.HandleFunc("/v1/network/status", g.networkStatusHandler)
mux.HandleFunc("/v1/network/peers", g.networkPeersHandler)
@ -44,5 +38,14 @@ func (g *Gateway) Routes() http.Handler {
mux.HandleFunc("/v1/pubsub/publish", g.pubsubPublishHandler)
mux.HandleFunc("/v1/pubsub/topics", g.pubsubTopicsHandler)
// database operations (dynamic clustering)
mux.HandleFunc("/v1/database/exec", g.databaseExecHandler)
mux.HandleFunc("/v1/database/query", g.databaseQueryHandler)
mux.HandleFunc("/v1/database/transaction", g.databaseTransactionHandler)
mux.HandleFunc("/v1/database/schema", g.databaseSchemaHandler)
mux.HandleFunc("/v1/database/create-table", g.databaseCreateTableHandler)
mux.HandleFunc("/v1/database/drop-table", g.databaseDropTableHandler)
mux.HandleFunc("/v1/database/list", g.databaseListHandler)
return g.withMiddleware(mux)
}

View File

@ -34,8 +34,8 @@ type Node struct {
logger *logging.ColoredLogger
host host.Host
rqliteManager *database.RQLiteManager
rqliteAdapter *database.RQLiteAdapter
// Dynamic database clustering
clusterManager *database.ClusterManager
// Peer discovery
discoveryCancel context.CancelFunc
@ -59,25 +59,26 @@ func NewNode(cfg *config.Config) (*Node, error) {
}, nil
}
// startRQLite initializes and starts the RQLite database
func (n *Node) startRQLite(ctx context.Context) error {
n.logger.Info("Starting RQLite database")
// startClusterManager initializes and starts the cluster manager for dynamic databases
func (n *Node) startClusterManager(ctx context.Context) error {
n.logger.Info("Starting dynamic database cluster manager")
// Create RQLite manager
n.rqliteManager = database.NewRQLiteManager(&n.config.Database, &n.config.Discovery, n.config.Node.DataDir, n.logger.Logger)
// Create cluster manager
n.clusterManager = database.NewClusterManager(
n.host.ID().String(),
&n.config.Database,
&n.config.Discovery,
n.config.Node.DataDir,
n.pubsub,
n.logger.Logger,
)
// Start RQLite
if err := n.rqliteManager.Start(ctx); err != nil {
return err
// Start cluster manager
if err := n.clusterManager.Start(); err != nil {
return fmt.Errorf("failed to start cluster manager: %w", err)
}
// Create adapter for sql.DB compatibility
adapter, err := database.NewRQLiteAdapter(n.rqliteManager)
if err != nil {
return fmt.Errorf("failed to create RQLite adapter: %w", err)
}
n.rqliteAdapter = adapter
n.logger.Info("Dynamic database cluster manager started successfully")
return nil
}
@ -563,19 +564,18 @@ func (n *Node) Stop() error {
// Stop peer discovery
n.stopPeerDiscovery()
// Stop cluster manager
if n.clusterManager != nil {
if err := n.clusterManager.Stop(); err != nil {
n.logger.ComponentWarn(logging.ComponentNode, "Error stopping cluster manager", zap.Error(err))
}
}
// Stop LibP2P host
if n.host != nil {
n.host.Close()
}
// Stop RQLite
if n.rqliteAdapter != nil {
n.rqliteAdapter.Close()
}
if n.rqliteManager != nil {
_ = n.rqliteManager.Stop()
}
n.logger.ComponentInfo(logging.ComponentNode, "Network node stopped")
return nil
}
@ -589,16 +589,16 @@ func (n *Node) Start(ctx context.Context) error {
return fmt.Errorf("failed to create data directory: %w", err)
}
// Start RQLite
if err := n.startRQLite(ctx); err != nil {
return fmt.Errorf("failed to start RQLite: %w", err)
}
// Start LibP2P host
// Start LibP2P host (required before cluster manager)
if err := n.startLibP2P(); err != nil {
return fmt.Errorf("failed to start LibP2P: %w", err)
}
// Start cluster manager for dynamic databases
if err := n.startClusterManager(ctx); err != nil {
return fmt.Errorf("failed to start cluster manager: %w", err)
}
// Get listen addresses for logging
var listenAddrs []string
for _, addr := range n.host.Addrs() {

View File

@ -1,46 +0,0 @@
package rqlite
import (
"database/sql"
"fmt"
_ "github.com/rqlite/gorqlite/stdlib" // Import the database/sql driver
)
// RQLiteAdapter adapts RQLite to the sql.DB interface
type RQLiteAdapter struct {
manager *RQLiteManager
db *sql.DB
}
// NewRQLiteAdapter creates a new adapter that provides sql.DB interface for RQLite
func NewRQLiteAdapter(manager *RQLiteManager) (*RQLiteAdapter, error) {
// Use the gorqlite database/sql driver
db, err := sql.Open("rqlite", fmt.Sprintf("http://localhost:%d", manager.config.RQLitePort))
if err != nil {
return nil, fmt.Errorf("failed to open RQLite SQL connection: %w", err)
}
return &RQLiteAdapter{
manager: manager,
db: db,
}, nil
}
// GetSQLDB returns the sql.DB interface for compatibility with existing storage service
func (a *RQLiteAdapter) GetSQLDB() *sql.DB {
return a.db
}
// GetManager returns the underlying RQLite manager for advanced operations
func (a *RQLiteAdapter) GetManager() *RQLiteManager {
return a.manager
}
// Close closes the adapter connections
func (a *RQLiteAdapter) Close() error {
if a.db != nil {
a.db.Close()
}
return a.manager.Stop()
}

View File

@ -1,835 +0,0 @@
package rqlite
// client.go defines the ORM-like interfaces and a minimal implementation over database/sql.
// It builds on the rqlite stdlib driver so it behaves like a regular SQL-backed ORM.
import (
"context"
"database/sql"
"errors"
"fmt"
"reflect"
"strings"
"time"
)
// TableNamer lets a struct provide its table name.
type TableNamer interface {
TableName() string
}
// Client is the high-level ORM-like API.
type Client interface {
// Query runs an arbitrary SELECT and scans rows into dest (pointer to slice of structs or []map[string]any).
Query(ctx context.Context, dest any, query string, args ...any) error
// Exec runs a write statement (INSERT/UPDATE/DELETE).
Exec(ctx context.Context, query string, args ...any) (sql.Result, error)
// FindBy/FindOneBy provide simple map-based criteria filtering.
FindBy(ctx context.Context, dest any, table string, criteria map[string]any, opts ...FindOption) error
FindOneBy(ctx context.Context, dest any, table string, criteria map[string]any, opts ...FindOption) error
// Save inserts or updates an entity (single-PK).
Save(ctx context.Context, entity any) error
// Remove deletes by PK (single-PK).
Remove(ctx context.Context, entity any) error
// Repositories (generic layer). Optional but convenient if you use Go generics.
Repository(table string) any
// Fluent query builder for advanced querying.
CreateQueryBuilder(table string) *QueryBuilder
// Tx executes a function within a transaction.
Tx(ctx context.Context, fn func(tx Tx) error) error
}
// Tx mirrors Client but executes within a transaction.
type Tx interface {
Query(ctx context.Context, dest any, query string, args ...any) error
Exec(ctx context.Context, query string, args ...any) (sql.Result, error)
CreateQueryBuilder(table string) *QueryBuilder
// Optional: scoped Save/Remove inside tx
Save(ctx context.Context, entity any) error
Remove(ctx context.Context, entity any) error
}
// Repository provides typed entity operations for a table.
type Repository[T any] interface {
Find(ctx context.Context, dest *[]T, criteria map[string]any, opts ...FindOption) error
FindOne(ctx context.Context, dest *T, criteria map[string]any, opts ...FindOption) error
Save(ctx context.Context, entity *T) error
Remove(ctx context.Context, entity *T) error
// Builder helpers
Q() *QueryBuilder
}
// NewClient wires the ORM client to a *sql.DB (from your RQLiteAdapter).
func NewClient(db *sql.DB) Client {
return &client{db: db}
}
// NewClientFromAdapter is convenient if you already created the adapter.
func NewClientFromAdapter(adapter *RQLiteAdapter) Client {
return NewClient(adapter.GetSQLDB())
}
// client implements Client over *sql.DB.
type client struct {
db *sql.DB
}
func (c *client) Query(ctx context.Context, dest any, query string, args ...any) error {
rows, err := c.db.QueryContext(ctx, query, args...)
if err != nil {
return err
}
defer rows.Close()
return scanIntoDest(rows, dest)
}
func (c *client) Exec(ctx context.Context, query string, args ...any) (sql.Result, error) {
return c.db.ExecContext(ctx, query, args...)
}
func (c *client) FindBy(ctx context.Context, dest any, table string, criteria map[string]any, opts ...FindOption) error {
qb := c.CreateQueryBuilder(table)
for k, v := range criteria {
qb = qb.AndWhere(fmt.Sprintf("%s = ?", k), v)
}
for _, opt := range opts {
opt(qb)
}
return qb.GetMany(ctx, dest)
}
func (c *client) FindOneBy(ctx context.Context, dest any, table string, criteria map[string]any, opts ...FindOption) error {
qb := c.CreateQueryBuilder(table)
for k, v := range criteria {
qb = qb.AndWhere(fmt.Sprintf("%s = ?", k), v)
}
for _, opt := range opts {
opt(qb)
}
return qb.GetOne(ctx, dest)
}
func (c *client) Save(ctx context.Context, entity any) error {
return saveEntity(ctx, c.db, entity)
}
func (c *client) Remove(ctx context.Context, entity any) error {
return removeEntity(ctx, c.db, entity)
}
func (c *client) Repository(table string) any {
// This returns an untyped interface since Go methods cannot have type parameters
// Users will need to type assert the result to Repository[T]
return func() any {
return &repository[any]{c: c, table: table}
}()
}
func (c *client) CreateQueryBuilder(table string) *QueryBuilder {
return newQueryBuilder(c.db, table)
}
func (c *client) Tx(ctx context.Context, fn func(tx Tx) error) error {
sqlTx, err := c.db.BeginTx(ctx, nil)
if err != nil {
return err
}
txc := &txClient{tx: sqlTx}
if err := fn(txc); err != nil {
_ = sqlTx.Rollback()
return err
}
return sqlTx.Commit()
}
// txClient implements Tx over *sql.Tx.
type txClient struct {
tx *sql.Tx
}
func (t *txClient) Query(ctx context.Context, dest any, query string, args ...any) error {
rows, err := t.tx.QueryContext(ctx, query, args...)
if err != nil {
return err
}
defer rows.Close()
return scanIntoDest(rows, dest)
}
func (t *txClient) Exec(ctx context.Context, query string, args ...any) (sql.Result, error) {
return t.tx.ExecContext(ctx, query, args...)
}
func (t *txClient) CreateQueryBuilder(table string) *QueryBuilder {
return newQueryBuilder(t.tx, table)
}
func (t *txClient) Save(ctx context.Context, entity any) error {
return saveEntity(ctx, t.tx, entity)
}
func (t *txClient) Remove(ctx context.Context, entity any) error {
return removeEntity(ctx, t.tx, entity)
}
// executor is implemented by *sql.DB and *sql.Tx.
type executor interface {
QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error)
ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}
// QueryBuilder implements a fluent SELECT builder with joins, where, etc.
type QueryBuilder struct {
exec executor
table string
alias string
selects []string
joins []joinClause
wheres []whereClause
groupBys []string
orderBys []string
limit *int
offset *int
}
// joinClause represents INNER/LEFT/etc joins.
type joinClause struct {
kind string // "INNER", "LEFT", "JOIN" (default)
table string
on string
}
// whereClause holds an expression and args with a conjunction.
type whereClause struct {
conj string // "AND" or "OR"
expr string
args []any
}
func newQueryBuilder(exec executor, table string) *QueryBuilder {
return &QueryBuilder{
exec: exec,
table: table,
}
}
func (qb *QueryBuilder) Select(cols ...string) *QueryBuilder {
qb.selects = append(qb.selects, cols...)
return qb
}
func (qb *QueryBuilder) Alias(a string) *QueryBuilder {
qb.alias = a
return qb
}
func (qb *QueryBuilder) Where(expr string, args ...any) *QueryBuilder {
return qb.AndWhere(expr, args...)
}
func (qb *QueryBuilder) AndWhere(expr string, args ...any) *QueryBuilder {
qb.wheres = append(qb.wheres, whereClause{conj: "AND", expr: expr, args: args})
return qb
}
func (qb *QueryBuilder) OrWhere(expr string, args ...any) *QueryBuilder {
qb.wheres = append(qb.wheres, whereClause{conj: "OR", expr: expr, args: args})
return qb
}
func (qb *QueryBuilder) InnerJoin(table string, on string) *QueryBuilder {
qb.joins = append(qb.joins, joinClause{kind: "INNER", table: table, on: on})
return qb
}
func (qb *QueryBuilder) LeftJoin(table string, on string) *QueryBuilder {
qb.joins = append(qb.joins, joinClause{kind: "LEFT", table: table, on: on})
return qb
}
func (qb *QueryBuilder) Join(table string, on string) *QueryBuilder {
qb.joins = append(qb.joins, joinClause{kind: "JOIN", table: table, on: on})
return qb
}
func (qb *QueryBuilder) GroupBy(cols ...string) *QueryBuilder {
qb.groupBys = append(qb.groupBys, cols...)
return qb
}
func (qb *QueryBuilder) OrderBy(exprs ...string) *QueryBuilder {
qb.orderBys = append(qb.orderBys, exprs...)
return qb
}
func (qb *QueryBuilder) Limit(n int) *QueryBuilder {
qb.limit = &n
return qb
}
func (qb *QueryBuilder) Offset(n int) *QueryBuilder {
qb.offset = &n
return qb
}
// Build returns the SQL string and args for a SELECT.
func (qb *QueryBuilder) Build() (string, []any) {
cols := "*"
if len(qb.selects) > 0 {
cols = strings.Join(qb.selects, ", ")
}
base := fmt.Sprintf("SELECT %s FROM %s", cols, qb.table)
if qb.alias != "" {
base += " AS " + qb.alias
}
args := make([]any, 0, 16)
for _, j := range qb.joins {
base += fmt.Sprintf(" %s JOIN %s ON %s", j.kind, j.table, j.on)
}
if len(qb.wheres) > 0 {
base += " WHERE "
for i, w := range qb.wheres {
if i > 0 {
base += " " + w.conj + " "
}
base += "(" + w.expr + ")"
args = append(args, w.args...)
}
}
if len(qb.groupBys) > 0 {
base += " GROUP BY " + strings.Join(qb.groupBys, ", ")
}
if len(qb.orderBys) > 0 {
base += " ORDER BY " + strings.Join(qb.orderBys, ", ")
}
if qb.limit != nil {
base += fmt.Sprintf(" LIMIT %d", *qb.limit)
}
if qb.offset != nil {
base += fmt.Sprintf(" OFFSET %d", *qb.offset)
}
return base, args
}
// GetMany executes the built query and scans into dest (pointer to slice).
func (qb *QueryBuilder) GetMany(ctx context.Context, dest any) error {
sqlStr, args := qb.Build()
rows, err := qb.exec.QueryContext(ctx, sqlStr, args...)
if err != nil {
return err
}
defer rows.Close()
return scanIntoDest(rows, dest)
}
// GetOne executes the built query and scans into dest (pointer to struct or map) with LIMIT 1.
func (qb *QueryBuilder) GetOne(ctx context.Context, dest any) error {
limit := 1
if qb.limit == nil {
qb.limit = &limit
} else if qb.limit != nil && *qb.limit > 1 {
qb.limit = &limit
}
sqlStr, args := qb.Build()
rows, err := qb.exec.QueryContext(ctx, sqlStr, args...)
if err != nil {
return err
}
defer rows.Close()
if !rows.Next() {
return sql.ErrNoRows
}
return scanIntoSingle(rows, dest)
}
// FindOption customizes Find queries.
type FindOption func(q *QueryBuilder)
func WithOrderBy(exprs ...string) FindOption {
return func(q *QueryBuilder) { q.OrderBy(exprs...) }
}
func WithGroupBy(cols ...string) FindOption {
return func(q *QueryBuilder) { q.GroupBy(cols...) }
}
func WithLimit(n int) FindOption {
return func(q *QueryBuilder) { q.Limit(n) }
}
func WithOffset(n int) FindOption {
return func(q *QueryBuilder) { q.Offset(n) }
}
func WithSelect(cols ...string) FindOption {
return func(q *QueryBuilder) { q.Select(cols...) }
}
func WithJoin(kind, table, on string) FindOption {
return func(q *QueryBuilder) {
switch strings.ToUpper(kind) {
case "INNER":
q.InnerJoin(table, on)
case "LEFT":
q.LeftJoin(table, on)
default:
q.Join(table, on)
}
}
}
// repository is a generic table repository for type T.
type repository[T any] struct {
c *client
table string
}
func (r *repository[T]) Find(ctx context.Context, dest *[]T, criteria map[string]any, opts ...FindOption) error {
qb := r.c.CreateQueryBuilder(r.table)
for k, v := range criteria {
qb.AndWhere(fmt.Sprintf("%s = ?", k), v)
}
for _, opt := range opts {
opt(qb)
}
return qb.GetMany(ctx, dest)
}
func (r *repository[T]) FindOne(ctx context.Context, dest *T, criteria map[string]any, opts ...FindOption) error {
qb := r.c.CreateQueryBuilder(r.table)
for k, v := range criteria {
qb.AndWhere(fmt.Sprintf("%s = ?", k), v)
}
for _, opt := range opts {
opt(qb)
}
return qb.GetOne(ctx, dest)
}
func (r *repository[T]) Save(ctx context.Context, entity *T) error {
return saveEntity(ctx, r.c.db, entity)
}
func (r *repository[T]) Remove(ctx context.Context, entity *T) error {
return removeEntity(ctx, r.c.db, entity)
}
func (r *repository[T]) Q() *QueryBuilder {
return r.c.CreateQueryBuilder(r.table)
}
// -----------------------
// Reflection + scanning
// -----------------------
func scanIntoDest(rows *sql.Rows, dest any) error {
// dest must be pointer to slice (of struct or map)
rv := reflect.ValueOf(dest)
if rv.Kind() != reflect.Pointer || rv.IsNil() {
return errors.New("dest must be a non-nil pointer")
}
sliceVal := rv.Elem()
if sliceVal.Kind() != reflect.Slice {
return errors.New("dest must be pointer to a slice")
}
elemType := sliceVal.Type().Elem()
cols, err := rows.Columns()
if err != nil {
return err
}
for rows.Next() {
itemPtr := reflect.New(elemType)
// Support map[string]any and struct
if elemType.Kind() == reflect.Map {
m, err := scanRowToMap(rows, cols)
if err != nil {
return err
}
sliceVal.Set(reflect.Append(sliceVal, reflect.ValueOf(m)))
continue
}
if elemType.Kind() == reflect.Struct {
if err := scanCurrentRowIntoStruct(rows, cols, itemPtr.Elem()); err != nil {
return err
}
sliceVal.Set(reflect.Append(sliceVal, itemPtr.Elem()))
continue
}
return fmt.Errorf("unsupported slice element type: %s", elemType.Kind())
}
return rows.Err()
}
func scanIntoSingle(rows *sql.Rows, dest any) error {
rv := reflect.ValueOf(dest)
if rv.Kind() != reflect.Pointer || rv.IsNil() {
return errors.New("dest must be a non-nil pointer")
}
cols, err := rows.Columns()
if err != nil {
return err
}
switch rv.Elem().Kind() {
case reflect.Map:
m, err := scanRowToMap(rows, cols)
if err != nil {
return err
}
rv.Elem().Set(reflect.ValueOf(m))
return nil
case reflect.Struct:
return scanCurrentRowIntoStruct(rows, cols, rv.Elem())
default:
return fmt.Errorf("unsupported dest kind: %s", rv.Elem().Kind())
}
}
func scanRowToMap(rows *sql.Rows, cols []string) (map[string]any, error) {
raw := make([]any, len(cols))
ptrs := make([]any, len(cols))
for i := range raw {
ptrs[i] = &raw[i]
}
if err := rows.Scan(ptrs...); err != nil {
return nil, err
}
out := make(map[string]any, len(cols))
for i, c := range cols {
out[c] = normalizeSQLValue(raw[i])
}
return out, nil
}
func scanCurrentRowIntoStruct(rows *sql.Rows, cols []string, destStruct reflect.Value) error {
raw := make([]any, len(cols))
ptrs := make([]any, len(cols))
for i := range raw {
ptrs[i] = &raw[i]
}
if err := rows.Scan(ptrs...); err != nil {
return err
}
fieldIndex := buildFieldIndex(destStruct.Type())
for i, c := range cols {
if idx, ok := fieldIndex[strings.ToLower(c)]; ok {
field := destStruct.Field(idx)
if field.CanSet() {
if err := setReflectValue(field, raw[i]); err != nil {
return fmt.Errorf("column %s: %w", c, err)
}
}
}
}
return nil
}
func normalizeSQLValue(v any) any {
switch t := v.(type) {
case []byte:
return string(t)
default:
return v
}
}
func buildFieldIndex(t reflect.Type) map[string]int {
m := make(map[string]int)
for i := 0; i < t.NumField(); i++ {
f := t.Field(i)
if f.IsExported() == false {
continue
}
tag := f.Tag.Get("db")
col := ""
if tag != "" {
col = strings.Split(tag, ",")[0]
}
if col == "" {
col = f.Name
}
m[strings.ToLower(col)] = i
}
return m
}
func setReflectValue(field reflect.Value, raw any) error {
if raw == nil {
// leave zero value
return nil
}
switch field.Kind() {
case reflect.String:
switch v := raw.(type) {
case string:
field.SetString(v)
case []byte:
field.SetString(string(v))
default:
field.SetString(fmt.Sprint(v))
}
case reflect.Bool:
switch v := raw.(type) {
case bool:
field.SetBool(v)
case int64:
field.SetBool(v != 0)
case []byte:
s := string(v)
field.SetBool(s == "1" || strings.EqualFold(s, "true"))
default:
field.SetBool(false)
}
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
switch v := raw.(type) {
case int64:
field.SetInt(v)
case []byte:
var n int64
fmt.Sscan(string(v), &n)
field.SetInt(n)
default:
return fmt.Errorf("cannot convert %T to int", raw)
}
case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
switch v := raw.(type) {
case int64:
if v < 0 {
v = 0
}
field.SetUint(uint64(v))
case []byte:
var n uint64
fmt.Sscan(string(v), &n)
field.SetUint(n)
default:
return fmt.Errorf("cannot convert %T to uint", raw)
}
case reflect.Float32, reflect.Float64:
switch v := raw.(type) {
case float64:
field.SetFloat(v)
case []byte:
var fv float64
fmt.Sscan(string(v), &fv)
field.SetFloat(fv)
default:
return fmt.Errorf("cannot convert %T to float", raw)
}
case reflect.Struct:
// Support time.Time; extend as needed.
if field.Type() == reflect.TypeOf(time.Time{}) {
switch v := raw.(type) {
case time.Time:
field.Set(reflect.ValueOf(v))
case []byte:
// Try RFC3339
if tt, err := time.Parse(time.RFC3339, string(v)); err == nil {
field.Set(reflect.ValueOf(tt))
}
}
return nil
}
fallthrough
default:
// Not supported yet
return fmt.Errorf("unsupported dest field kind: %s", field.Kind())
}
return nil
}
// -----------------------
// Save/Remove (basic PK)
// -----------------------
type fieldMeta struct {
index int
column string
isPK bool
auto bool
}
func collectMeta(t reflect.Type) (fields []fieldMeta, pk fieldMeta, hasPK bool) {
for i := 0; i < t.NumField(); i++ {
f := t.Field(i)
if !f.IsExported() {
continue
}
tag := f.Tag.Get("db")
if tag == "-" {
continue
}
opts := strings.Split(tag, ",")
col := opts[0]
if col == "" {
col = f.Name
}
meta := fieldMeta{index: i, column: col}
for _, o := range opts[1:] {
switch strings.ToLower(strings.TrimSpace(o)) {
case "pk":
meta.isPK = true
case "auto", "autoincrement":
meta.auto = true
}
}
// If not tagged as pk, fallback to field name "ID"
if !meta.isPK && f.Name == "ID" {
meta.isPK = true
if col == "" {
meta.column = "id"
}
}
fields = append(fields, meta)
if meta.isPK {
pk = meta
hasPK = true
}
}
return
}
func getTableNameFromEntity(v reflect.Value) (string, bool) {
// If entity implements TableNamer
if v.CanInterface() {
if tn, ok := v.Interface().(TableNamer); ok {
return tn.TableName(), true
}
}
// Fallback: very naive pluralization (append 's')
typ := v.Type()
if typ.Kind() == reflect.Pointer {
typ = typ.Elem()
}
if typ.Kind() == reflect.Struct {
return strings.ToLower(typ.Name()) + "s", true
}
return "", false
}
func saveEntity(ctx context.Context, exec executor, entity any) error {
rv := reflect.ValueOf(entity)
if rv.Kind() != reflect.Pointer || rv.IsNil() {
return errors.New("entity must be a non-nil pointer to struct")
}
ev := rv.Elem()
if ev.Kind() != reflect.Struct {
return errors.New("entity must point to a struct")
}
fields, pkMeta, hasPK := collectMeta(ev.Type())
if !hasPK {
return errors.New("no primary key field found (tag db:\"...,pk\" or field named ID)")
}
table, ok := getTableNameFromEntity(ev)
if !ok || table == "" {
return errors.New("unable to resolve table name; implement TableNamer or set up a repository with explicit table")
}
// Build lists
cols := make([]string, 0, len(fields))
vals := make([]any, 0, len(fields))
setParts := make([]string, 0, len(fields))
var pkVal any
var pkIsZero bool
for _, fm := range fields {
f := ev.Field(fm.index)
if fm.isPK {
pkVal = f.Interface()
pkIsZero = isZeroValue(f)
continue
}
cols = append(cols, fm.column)
vals = append(vals, f.Interface())
setParts = append(setParts, fmt.Sprintf("%s = ?", fm.column))
}
if pkIsZero {
// INSERT
placeholders := strings.Repeat("?,", len(cols))
if len(placeholders) > 0 {
placeholders = placeholders[:len(placeholders)-1]
}
sqlStr := fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)", table, strings.Join(cols, ", "), placeholders)
res, err := exec.ExecContext(ctx, sqlStr, vals...)
if err != nil {
return err
}
// Set auto ID if needed
if pkMeta.auto {
if id, err := res.LastInsertId(); err == nil {
ev.Field(pkMeta.index).SetInt(id)
}
}
return nil
}
// UPDATE ... WHERE pk = ?
sqlStr := fmt.Sprintf("UPDATE %s SET %s WHERE %s = ?", table, strings.Join(setParts, ", "), pkMeta.column)
valsWithPK := append(vals, pkVal)
_, err := exec.ExecContext(ctx, sqlStr, valsWithPK...)
return err
}
func removeEntity(ctx context.Context, exec executor, entity any) error {
rv := reflect.ValueOf(entity)
if rv.Kind() != reflect.Pointer || rv.IsNil() {
return errors.New("entity must be a non-nil pointer to struct")
}
ev := rv.Elem()
if ev.Kind() != reflect.Struct {
return errors.New("entity must point to a struct")
}
_, pkMeta, hasPK := collectMeta(ev.Type())
if !hasPK {
return errors.New("no primary key field found")
}
table, ok := getTableNameFromEntity(ev)
if !ok || table == "" {
return errors.New("unable to resolve table name")
}
pkVal := ev.Field(pkMeta.index).Interface()
sqlStr := fmt.Sprintf("DELETE FROM %s WHERE %s = ?", table, pkMeta.column)
_, err := exec.ExecContext(ctx, sqlStr, pkVal)
return err
}
func isZeroValue(v reflect.Value) bool {
switch v.Kind() {
case reflect.String:
return v.Len() == 0
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
return v.Int() == 0
case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
return v.Uint() == 0
case reflect.Bool:
return v.Bool() == false
case reflect.Pointer, reflect.Interface:
return v.IsNil()
case reflect.Slice, reflect.Map:
return v.Len() == 0
case reflect.Struct:
// Special-case time.Time
if v.Type() == reflect.TypeOf(time.Time{}) {
t := v.Interface().(time.Time)
return t.IsZero()
}
zero := reflect.Zero(v.Type())
return reflect.DeepEqual(v.Interface(), zero.Interface())
default:
return false
}
}

View File

@ -0,0 +1,902 @@
package rqlite
import (
"context"
"fmt"
"time"
"go.uber.org/zap"
)
// handleCreateRequest processes a database creation request
func (cm *ClusterManager) handleCreateRequest(msg *MetadataMessage) error {
var req DatabaseCreateRequest
if err := msg.UnmarshalPayload(&req); err != nil {
return err
}
cm.logger.Info("Received database create request",
zap.String("database", req.DatabaseName),
zap.String("requester", req.RequesterNodeID),
zap.Int("replication_factor", req.ReplicationFactor))
// Check if we can host this database
cm.mu.RLock()
currentCount := len(cm.activeClusters)
cm.mu.RUnlock()
if currentCount >= cm.config.MaxDatabases {
cm.logger.Debug("Cannot host database: at capacity",
zap.String("database", req.DatabaseName),
zap.Int("current", currentCount),
zap.Int("max", cm.config.MaxDatabases))
return nil
}
// Allocate ports
ports, err := cm.portManager.AllocatePortPair(req.DatabaseName)
if err != nil {
cm.logger.Warn("Cannot allocate ports for database",
zap.String("database", req.DatabaseName),
zap.Error(err))
return nil
}
// Send response offering to host
response := DatabaseCreateResponse{
DatabaseName: req.DatabaseName,
NodeID: cm.nodeID,
AvailablePorts: ports,
}
msgData, err := MarshalMetadataMessage(MsgDatabaseCreateResponse, cm.nodeID, response)
if err != nil {
cm.portManager.ReleasePortPair(ports)
return fmt.Errorf("failed to marshal create response: %w", err)
}
topic := "/debros/metadata/v1"
if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
cm.portManager.ReleasePortPair(ports)
return fmt.Errorf("failed to publish create response: %w", err)
}
cm.logger.Info("Sent database create response",
zap.String("database", req.DatabaseName),
zap.Int("http_port", ports.HTTPPort),
zap.Int("raft_port", ports.RaftPort))
return nil
}
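// Requester-side sketch (illustrative): what publishing the corresponding
// create request might look like. MsgDatabaseCreateRequest is an assumed
// constant alongside the response/confirm message types used in this file.
//
//	req := DatabaseCreateRequest{
//		DatabaseName:      "myapp_orders",
//		RequesterNodeID:   cm.nodeID,
//		ReplicationFactor: 3,
//	}
//	msgData, _ := MarshalMetadataMessage(MsgDatabaseCreateRequest, cm.nodeID, req)
//	_ = cm.pubsubAdapter.Publish(cm.ctx, "/debros/metadata/v1", msgData)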
// handleCreateResponse processes a database creation response
func (cm *ClusterManager) handleCreateResponse(msg *MetadataMessage) error {
var response DatabaseCreateResponse
if err := msg.UnmarshalPayload(&response); err != nil {
return err
}
cm.logger.Debug("Received database create response",
zap.String("database", response.DatabaseName),
zap.String("node", response.NodeID))
// Forward to coordinator registry
cm.coordinatorRegistry.HandleCreateResponse(response)
return nil
}
// handleCreateConfirm processes a database creation confirmation
func (cm *ClusterManager) handleCreateConfirm(msg *MetadataMessage) error {
var confirm DatabaseCreateConfirm
if err := msg.UnmarshalPayload(&confirm); err != nil {
return err
}
cm.logger.Info("Received database create confirm",
zap.String("database", confirm.DatabaseName),
zap.String("coordinator", confirm.CoordinatorNodeID),
zap.Int("nodes", len(confirm.SelectedNodes)))
// Check if this node was selected
var myAssignment *NodeAssignment
for i, node := range confirm.SelectedNodes {
if node.NodeID == cm.nodeID {
myAssignment = &confirm.SelectedNodes[i]
break
}
}
if myAssignment == nil {
cm.logger.Debug("Not selected for this database",
zap.String("database", confirm.DatabaseName))
return nil
}
cm.logger.Info("Selected to host database",
zap.String("database", confirm.DatabaseName),
zap.String("role", myAssignment.Role))
// Create database metadata
portMappings := make(map[string]PortPair)
nodeIDs := make([]string, len(confirm.SelectedNodes))
for i, node := range confirm.SelectedNodes {
nodeIDs[i] = node.NodeID
portMappings[node.NodeID] = PortPair{
HTTPPort: node.HTTPPort,
RaftPort: node.RaftPort,
}
}
metadata := &DatabaseMetadata{
DatabaseName: confirm.DatabaseName,
NodeIDs: nodeIDs,
PortMappings: portMappings,
Status: StatusInitializing,
CreatedAt: time.Now(),
LastAccessed: time.Now(),
LeaderNodeID: confirm.SelectedNodes[0].NodeID, // First node is leader
Version: 1,
VectorClock: NewVectorClock(),
}
// Update vector clock
UpdateDatabaseMetadata(metadata, cm.nodeID)
// Store metadata
cm.metadataStore.SetDatabase(metadata)
// Start the RQLite instance
go cm.startDatabaseInstance(metadata, myAssignment.Role == "leader")
return nil
}
// startDatabaseInstance starts a database instance on this node
func (cm *ClusterManager) startDatabaseInstance(metadata *DatabaseMetadata, isLeader bool) {
ports := metadata.PortMappings[cm.nodeID]
// Create advertised addresses
advHTTPAddr := fmt.Sprintf("%s:%d", cm.getAdvertiseAddress(), ports.HTTPPort)
advRaftAddr := fmt.Sprintf("%s:%d", cm.getAdvertiseAddress(), ports.RaftPort)
// Create instance
instance := NewRQLiteInstance(
metadata.DatabaseName,
ports,
cm.dataDir,
advHTTPAddr,
advRaftAddr,
cm.logger,
)
// Determine join address (if follower)
var joinAddr string
if !isLeader && len(metadata.NodeIDs) > 0 {
// Join the leader. Note: the join address below reuses this node's advertise
// address for the leader, which assumes both share a host (true for
// single-host test clusters).
leaderNodeID := metadata.LeaderNodeID
if leaderPorts, exists := metadata.PortMappings[leaderNodeID]; exists {
joinAddr = fmt.Sprintf("%s:%d", cm.getAdvertiseAddress(), leaderPorts.RaftPort)
}
}
// Start the instance
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
if err := instance.Start(ctx, isLeader, joinAddr); err != nil {
cm.logger.Error("Failed to start database instance",
zap.String("database", metadata.DatabaseName),
zap.Error(err))
// Broadcast that the instance is still initializing (there is no dedicated
// failure status yet)
cm.broadcastStatusUpdate(metadata.DatabaseName, StatusInitializing)
return
}
// Store active instance
cm.mu.Lock()
cm.activeClusters[metadata.DatabaseName] = instance
cm.mu.Unlock()
// Broadcast active status
cm.broadcastStatusUpdate(metadata.DatabaseName, StatusActive)
cm.logger.Info("Database instance started and active",
zap.String("database", metadata.DatabaseName))
}
// handleStatusUpdate processes database status updates
func (cm *ClusterManager) handleStatusUpdate(msg *MetadataMessage) error {
var update DatabaseStatusUpdate
if err := msg.UnmarshalPayload(&update); err != nil {
return err
}
cm.logger.Debug("Received status update",
zap.String("database", update.DatabaseName),
zap.String("node", update.NodeID),
zap.String("status", string(update.Status)))
// Update metadata
if metadata := cm.metadataStore.GetDatabase(update.DatabaseName); metadata != nil {
metadata.Status = update.Status
metadata.LastAccessed = time.Now()
cm.metadataStore.SetDatabase(metadata)
}
return nil
}
// handleCapacityAnnouncement processes node capacity announcements
func (cm *ClusterManager) handleCapacityAnnouncement(msg *MetadataMessage) error {
var announcement NodeCapacityAnnouncement
if err := msg.UnmarshalPayload(&announcement); err != nil {
return err
}
capacity := &NodeCapacity{
NodeID: announcement.NodeID,
MaxDatabases: announcement.MaxDatabases,
CurrentDatabases: announcement.CurrentDatabases,
PortRangeHTTP: announcement.PortRangeHTTP,
PortRangeRaft: announcement.PortRangeRaft,
LastHealthCheck: time.Now(),
IsHealthy: true,
}
cm.metadataStore.SetNode(capacity)
return nil
}
// handleHealthPing processes health ping messages
func (cm *ClusterManager) handleHealthPing(msg *MetadataMessage) error {
var ping NodeHealthPing
if err := msg.UnmarshalPayload(&ping); err != nil {
return err
}
// Respond with pong
pong := NodeHealthPong{
NodeID: cm.nodeID,
Healthy: true,
PingFrom: ping.NodeID,
}
msgData, err := MarshalMetadataMessage(MsgNodeHealthPong, cm.nodeID, pong)
if err != nil {
return err
}
topic := "/debros/metadata/v1"
return cm.pubsubAdapter.Publish(cm.ctx, topic, msgData)
}
// handleMetadataSync processes metadata synchronization messages
func (cm *ClusterManager) handleMetadataSync(msg *MetadataMessage) error {
var sync MetadataSync
if err := msg.UnmarshalPayload(&sync); err != nil {
return err
}
if sync.Metadata == nil {
return nil
}
// Check if we need to update local metadata
existing := cm.metadataStore.GetDatabase(sync.Metadata.DatabaseName)
if existing == nil {
// New database we didn't know about
cm.metadataStore.SetDatabase(sync.Metadata)
cm.logger.Info("Learned about new database via sync",
zap.String("database", sync.Metadata.DatabaseName))
return nil
}
// Resolve conflict if versions differ
winner := ResolveConflict(existing, sync.Metadata)
if winner != existing {
cm.metadataStore.SetDatabase(winner)
cm.logger.Info("Updated database metadata via sync",
zap.String("database", sync.Metadata.DatabaseName))
}
return nil
}
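// ResolveConflict (defined elsewhere) is assumed to prefer the metadata whose
// vector clock dominates, falling back to the higher Version for concurrent
// updates. A minimal sketch under that assumption; Dominates is hypothetical:
//
//	func resolveConflictSketch(a, b *DatabaseMetadata) *DatabaseMetadata {
//		if a.VectorClock.Dominates(b.VectorClock) {
//			return a
//		}
//		if b.VectorClock.Dominates(a.VectorClock) {
//			return b
//		}
//		if b.Version > a.Version { // concurrent: higher version wins
//			return b
//		}
//		return a
//	}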
// handleChecksumRequest processes checksum requests
func (cm *ClusterManager) handleChecksumRequest(msg *MetadataMessage) error {
var req MetadataChecksumRequest
if err := msg.UnmarshalPayload(&req); err != nil {
return err
}
// Compute checksums for all databases
checksums := ComputeFullStateChecksum(cm.metadataStore)
// Send response
response := MetadataChecksumResponse{
RequestID: req.RequestID,
Checksums: checksums,
}
msgData, err := MarshalMetadataMessage(MsgMetadataChecksumRes, cm.nodeID, response)
if err != nil {
return err
}
topic := "/debros/metadata/v1"
return cm.pubsubAdapter.Publish(cm.ctx, topic, msgData)
}
// handleChecksumResponse processes checksum responses
func (cm *ClusterManager) handleChecksumResponse(msg *MetadataMessage) error {
var response MetadataChecksumResponse
if err := msg.UnmarshalPayload(&response); err != nil {
return err
}
// Compare with local checksums
localChecksums := ComputeFullStateChecksum(cm.metadataStore)
localMap := make(map[string]MetadataChecksum)
for _, cs := range localChecksums {
localMap[cs.DatabaseName] = cs
}
// Check for differences
for _, remoteCS := range response.Checksums {
localCS, exists := localMap[remoteCS.DatabaseName]
if !exists {
// Database we don't know about - request full metadata
cm.logger.Info("Discovered database via checksum",
zap.String("database", remoteCS.DatabaseName))
// TODO: Request full metadata for this database
continue
}
if localCS.Hash != remoteCS.Hash {
cm.logger.Info("Database metadata diverged",
zap.String("database", remoteCS.DatabaseName))
// TODO: Request full metadata for this database
}
}
return nil
}
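// Together, handleChecksumRequest and handleChecksumResponse form a simple
// anti-entropy pass: nodes periodically exchange per-database digests and
// only need full metadata for entries whose hashes differ, which keeps the
// periodic gossip cheap.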
// broadcastStatusUpdate broadcasts a status update for a database
func (cm *ClusterManager) broadcastStatusUpdate(dbName string, status DatabaseStatus) {
cm.mu.RLock()
instance := cm.activeClusters[dbName]
cm.mu.RUnlock()
update := DatabaseStatusUpdate{
DatabaseName: dbName,
NodeID: cm.nodeID,
Status: status,
}
if instance != nil {
update.HTTPPort = instance.HTTPPort
update.RaftPort = instance.RaftPort
}
msgData, err := MarshalMetadataMessage(MsgDatabaseStatusUpdate, cm.nodeID, update)
if err != nil {
cm.logger.Warn("Failed to marshal status update", zap.Error(err))
return
}
topic := "/debros/metadata/v1"
if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
cm.logger.Warn("Failed to publish status update", zap.Error(err))
}
}
// getAdvertiseAddress returns the advertise address for this node
func (cm *ClusterManager) getAdvertiseAddress() string {
if cm.discoveryConfig.HttpAdvAddress != "" {
// Strip a trailing ":port" if present. This scans for the last colon,
// so bracketed IPv6 literals are not handled here.
addr := cm.discoveryConfig.HttpAdvAddress
for i := len(addr) - 1; i >= 0; i-- {
if addr[i] == ':' {
return addr[:i]
}
}
return addr
}
return "127.0.0.1"
}
// handleIdleNotification processes idle notifications from other nodes
func (cm *ClusterManager) handleIdleNotification(msg *MetadataMessage) error {
var notification DatabaseIdleNotification
if err := msg.UnmarshalPayload(&notification); err != nil {
return err
}
cm.logger.Debug("Received idle notification",
zap.String("database", notification.DatabaseName),
zap.String("from_node", notification.NodeID))
// Get database metadata
dbMeta := cm.metadataStore.GetDatabase(notification.DatabaseName)
if dbMeta == nil {
cm.logger.Debug("Idle notification for unknown database",
zap.String("database", notification.DatabaseName))
return nil
}
// Track idle state. This placeholder only credits the notifying node and
// this node, so the all-idle condition can only be satisfied for clusters
// of one or two members; production needs per-node idle tracking with a
// real quorum mechanism.
idleCount := 0
for _, nodeID := range dbMeta.NodeIDs {
if nodeID == notification.NodeID || nodeID == cm.nodeID {
idleCount++
}
}
// If every member is believed idle, coordinate shutdown
if idleCount >= len(dbMeta.NodeIDs) {
cm.logger.Info("All nodes idle for database, coordinating shutdown",
zap.String("database", notification.DatabaseName))
// Elect coordinator
coordinator := SelectCoordinator(dbMeta.NodeIDs)
if coordinator == cm.nodeID {
// This node is coordinator, initiate shutdown
shutdown := DatabaseShutdownCoordinated{
DatabaseName: notification.DatabaseName,
ShutdownTime: time.Now().Add(5 * time.Second), // Grace period
}
msgData, err := MarshalMetadataMessage(MsgDatabaseShutdownCoordinated, cm.nodeID, shutdown)
if err != nil {
return fmt.Errorf("failed to marshal shutdown message: %w", err)
}
topic := "/debros/metadata/v1"
if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
return fmt.Errorf("failed to publish shutdown message: %w", err)
}
cm.logger.Info("Coordinated shutdown message sent",
zap.String("database", notification.DatabaseName))
}
}
return nil
}
// handleShutdownCoordinated processes coordinated shutdown messages
func (cm *ClusterManager) handleShutdownCoordinated(msg *MetadataMessage) error {
var shutdown DatabaseShutdownCoordinated
if err := msg.UnmarshalPayload(&shutdown); err != nil {
return err
}
cm.logger.Info("Received coordinated shutdown",
zap.String("database", shutdown.DatabaseName),
zap.Time("shutdown_time", shutdown.ShutdownTime))
// Get database metadata
dbMeta := cm.metadataStore.GetDatabase(shutdown.DatabaseName)
if dbMeta == nil {
cm.logger.Debug("Shutdown for unknown database",
zap.String("database", shutdown.DatabaseName))
return nil
}
// Check if this node is a member
isMember := false
for _, nodeID := range dbMeta.NodeIDs {
if nodeID == cm.nodeID {
isMember = true
break
}
}
if !isMember {
return nil
}
// Wait until the scheduled shutdown time. Note that this blocks the pubsub
// handler goroutine for the grace period; a timer-driven shutdown would
// avoid stalling other metadata messages.
waitDuration := time.Until(shutdown.ShutdownTime)
if waitDuration > 0 {
cm.logger.Debug("Waiting for shutdown time",
zap.String("database", shutdown.DatabaseName),
zap.Duration("wait", waitDuration))
time.Sleep(waitDuration)
}
// Stop the instance
cm.mu.Lock()
instance, exists := cm.activeClusters[shutdown.DatabaseName]
if exists {
cm.logger.Info("Stopping database instance for hibernation",
zap.String("database", shutdown.DatabaseName))
if err := instance.Stop(); err != nil {
cm.logger.Error("Failed to stop instance", zap.Error(err))
cm.mu.Unlock()
return err
}
// Free ports
ports := PortPair{HTTPPort: instance.HTTPPort, RaftPort: instance.RaftPort}
cm.portManager.ReleasePortPair(ports)
// Remove from active clusters
delete(cm.activeClusters, shutdown.DatabaseName)
}
cm.mu.Unlock()
// Update metadata status to hibernating
dbMeta.Status = StatusHibernating
dbMeta.LastAccessed = time.Now()
cm.metadataStore.SetDatabase(dbMeta)
// Broadcast status update
cm.broadcastStatusUpdate(shutdown.DatabaseName, StatusHibernating)
cm.logger.Info("Database hibernated successfully",
zap.String("database", shutdown.DatabaseName))
return nil
}
// handleWakeupRequest processes wake-up requests for hibernating databases
func (cm *ClusterManager) handleWakeupRequest(msg *MetadataMessage) error {
var wakeup DatabaseWakeupRequest
if err := msg.UnmarshalPayload(&wakeup); err != nil {
return err
}
cm.logger.Info("Received wakeup request",
zap.String("database", wakeup.DatabaseName),
zap.String("requester", wakeup.RequesterNodeID))
// Get database metadata
dbMeta := cm.metadataStore.GetDatabase(wakeup.DatabaseName)
if dbMeta == nil {
cm.logger.Warn("Wakeup request for unknown database",
zap.String("database", wakeup.DatabaseName))
return nil
}
// Check if database is hibernating
if dbMeta.Status != StatusHibernating {
cm.logger.Debug("Database not hibernating, ignoring wakeup",
zap.String("database", wakeup.DatabaseName),
zap.String("status", string(dbMeta.Status)))
return nil
}
// Check if this node is a member
isMember := false
for _, nodeID := range dbMeta.NodeIDs {
if nodeID == cm.nodeID {
isMember = true
break
}
}
if !isMember {
return nil
}
// Update status to waking
dbMeta.Status = StatusWaking
dbMeta.LastAccessed = time.Now()
cm.metadataStore.SetDatabase(dbMeta)
// Start the instance
go cm.wakeupDatabase(wakeup.DatabaseName, dbMeta)
return nil
}
// wakeupDatabase starts a hibernating database
func (cm *ClusterManager) wakeupDatabase(dbName string, dbMeta *DatabaseMetadata) {
cm.logger.Info("Waking up database", zap.String("database", dbName))
// Get port mapping for this node
ports, exists := dbMeta.PortMappings[cm.nodeID]
if !exists {
cm.logger.Error("No port mapping found for node",
zap.String("database", dbName),
zap.String("node", cm.nodeID))
return
}
// Try to allocate the same ports (or new ones if taken)
allocatedPorts := ports
if cm.portManager.IsPortAllocated(ports.HTTPPort) || cm.portManager.IsPortAllocated(ports.RaftPort) {
cm.logger.Warn("Original ports taken, allocating new ones",
zap.String("database", dbName))
newPorts, err := cm.portManager.AllocatePortPair(dbName)
if err != nil {
cm.logger.Error("Failed to allocate ports for wakeup", zap.Error(err))
return
}
allocatedPorts = newPorts
// Update port mapping in metadata
dbMeta.PortMappings[cm.nodeID] = allocatedPorts
cm.metadataStore.SetDatabase(dbMeta)
} else {
// Mark ports as allocated
if err := cm.portManager.AllocateSpecificPorts(dbName, ports); err != nil {
cm.logger.Error("Failed to allocate specific ports", zap.Error(err))
return
}
}
// Determine join address (first node in the list).
// NOTE: metadata tracks only ports, not peer hosts, so this reuses the
// local advertise address with the first node's Raft port; that is only
// correct when all members share the same advertise host.
joinAddr := ""
if len(dbMeta.NodeIDs) > 0 && dbMeta.NodeIDs[0] != cm.nodeID {
firstNodePorts := dbMeta.PortMappings[dbMeta.NodeIDs[0]]
joinAddr = fmt.Sprintf("http://%s:%d", cm.getAdvertiseAddress(), firstNodePorts.RaftPort)
}
// Create and start instance
instance := NewRQLiteInstance(
dbName,
allocatedPorts,
cm.dataDir,
cm.getAdvertiseAddress(),
cm.getAdvertiseAddress(),
cm.logger,
)
// Determine if this is the leader (first node)
isLeader := len(dbMeta.NodeIDs) > 0 && dbMeta.NodeIDs[0] == cm.nodeID
if err := instance.Start(cm.ctx, isLeader, joinAddr); err != nil {
cm.logger.Error("Failed to start instance during wakeup", zap.Error(err))
cm.portManager.ReleasePortPair(allocatedPorts)
return
}
// Add to active clusters
cm.mu.Lock()
cm.activeClusters[dbName] = instance
cm.mu.Unlock()
// Update metadata status to active
dbMeta.Status = StatusActive
dbMeta.LastAccessed = time.Now()
cm.metadataStore.SetDatabase(dbMeta)
// Broadcast status update
cm.broadcastStatusUpdate(dbName, StatusActive)
cm.logger.Info("Database woke up successfully", zap.String("database", dbName))
}
// handleNodeReplacementNeeded processes requests to replace a failed node
func (cm *ClusterManager) handleNodeReplacementNeeded(msg *MetadataMessage) error {
var replacement NodeReplacementNeeded
if err := msg.UnmarshalPayload(&replacement); err != nil {
return err
}
cm.logger.Info("Received node replacement needed",
zap.String("database", replacement.DatabaseName),
zap.String("failed_node", replacement.FailedNodeID))
// Get database metadata
dbMeta := cm.metadataStore.GetDatabase(replacement.DatabaseName)
if dbMeta == nil {
cm.logger.Warn("Replacement needed for unknown database",
zap.String("database", replacement.DatabaseName))
return nil
}
// Check if we're eligible to replace (not at capacity and healthy)
nodeCapacity := cm.metadataStore.GetNode(cm.nodeID)
if nodeCapacity == nil || nodeCapacity.CurrentDatabases >= nodeCapacity.MaxDatabases {
cm.logger.Debug("Not eligible for replacement - at capacity",
zap.String("database", replacement.DatabaseName))
return nil
}
// Skip if this node is already a member
for _, nodeID := range dbMeta.NodeIDs {
if nodeID == cm.nodeID {
cm.logger.Debug("Already a member of this database",
zap.String("database", replacement.DatabaseName))
return nil
}
}
// Allocate ports for potential replacement
ports, err := cm.portManager.AllocatePortPair(replacement.DatabaseName)
if err != nil {
cm.logger.Warn("Cannot allocate ports for replacement",
zap.String("database", replacement.DatabaseName),
zap.Error(err))
return nil
}
// Send replacement offer
response := NodeReplacementOffer{
DatabaseName: replacement.DatabaseName,
NodeID: cm.nodeID,
AvailablePorts: ports,
}
msgData, err := MarshalMetadataMessage(MsgNodeReplacementOffer, cm.nodeID, response)
if err != nil {
cm.portManager.ReleasePortPair(ports)
return fmt.Errorf("failed to marshal replacement offer: %w", err)
}
topic := "/debros/metadata/v1"
if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
cm.portManager.ReleasePortPair(ports)
return fmt.Errorf("failed to publish replacement offer: %w", err)
}
cm.logger.Info("Sent replacement offer",
zap.String("database", replacement.DatabaseName))
return nil
}
// handleNodeReplacementOffer processes offers from nodes to replace a failed node
func (cm *ClusterManager) handleNodeReplacementOffer(msg *MetadataMessage) error {
var offer NodeReplacementOffer
if err := msg.UnmarshalPayload(&offer); err != nil {
return err
}
cm.logger.Debug("Received replacement offer",
zap.String("database", offer.DatabaseName),
zap.String("from_node", offer.NodeID))
// This would be handled by the coordinator who initiated the replacement request
// For now, we'll implement a simple first-come-first-served approach
// In production, this would involve collecting offers and selecting the best node
dbMeta := cm.metadataStore.GetDatabase(offer.DatabaseName)
if dbMeta == nil {
return nil
}
// Check if we're a surviving member and should coordinate
isMember := false
for _, nodeID := range dbMeta.NodeIDs {
if nodeID == cm.nodeID {
isMember = true
break
}
}
if !isMember {
return nil
}
// Simple approach: accept first offer
// In production: collect offers, select based on capacity/health
cm.logger.Info("Accepting replacement offer",
zap.String("database", offer.DatabaseName),
zap.String("new_node", offer.NodeID))
// Use this node as the join point: we verified above that we are a
// surviving member, and metadata tracks only ports (not peer hosts), so
// our own advertise address and Raft port are the only address we can
// reliably construct.
var joinAddr string
if ports, ok := dbMeta.PortMappings[cm.nodeID]; ok {
joinAddr = fmt.Sprintf("http://%s:%d", cm.getAdvertiseAddress(), ports.RaftPort)
}
// Broadcast confirmation
confirm := NodeReplacementConfirm{
DatabaseName: offer.DatabaseName,
NewNodeID: offer.NodeID,
ReplacedNodeID: "", // Would track which node failed
NewNodePorts: offer.AvailablePorts,
JoinAddress: joinAddr,
}
msgData, err := MarshalMetadataMessage(MsgNodeReplacementConfirm, cm.nodeID, confirm)
if err != nil {
return fmt.Errorf("failed to marshal replacement confirm: %w", err)
}
topic := "/debros/metadata/v1"
if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
return fmt.Errorf("failed to publish replacement confirm: %w", err)
}
return nil
}
// handleNodeReplacementConfirm processes confirmation of a replacement node
func (cm *ClusterManager) handleNodeReplacementConfirm(msg *MetadataMessage) error {
var confirm NodeReplacementConfirm
if err := msg.UnmarshalPayload(&confirm); err != nil {
return err
}
cm.logger.Info("Received node replacement confirm",
zap.String("database", confirm.DatabaseName),
zap.String("new_node", confirm.NewNodeID),
zap.String("replaced_node", confirm.ReplacedNodeID))
// Get database metadata
dbMeta := cm.metadataStore.GetDatabase(confirm.DatabaseName)
if dbMeta == nil {
cm.logger.Warn("Replacement confirm for unknown database",
zap.String("database", confirm.DatabaseName))
return nil
}
// Update metadata: replace old node with new node
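// NOTE: the offer path above sends ReplacedNodeID as "", in which case no
// entry matches below, NodeIDs is left unchanged, and the new node never
// joins the membership list. Reliable replacement needs real failed-node
// tracking.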
newNodes := make([]string, 0, len(dbMeta.NodeIDs))
for _, nodeID := range dbMeta.NodeIDs {
if nodeID == confirm.ReplacedNodeID {
newNodes = append(newNodes, confirm.NewNodeID)
} else {
newNodes = append(newNodes, nodeID)
}
}
dbMeta.NodeIDs = newNodes
// Update port mappings
delete(dbMeta.PortMappings, confirm.ReplacedNodeID)
dbMeta.PortMappings[confirm.NewNodeID] = confirm.NewNodePorts
cm.metadataStore.SetDatabase(dbMeta)
// If we're the new node, start the instance and join
if confirm.NewNodeID == cm.nodeID {
cm.logger.Info("Starting as replacement node",
zap.String("database", confirm.DatabaseName))
go cm.startReplacementInstance(confirm.DatabaseName, confirm.NewNodePorts, confirm.JoinAddress)
}
return nil
}
// startReplacementInstance starts an instance as a replacement for a failed node
func (cm *ClusterManager) startReplacementInstance(dbName string, ports PortPair, joinAddr string) {
cm.logger.Info("Starting replacement instance",
zap.String("database", dbName),
zap.String("join_address", joinAddr))
// Create instance
instance := NewRQLiteInstance(
dbName,
ports,
cm.dataDir,
cm.getAdvertiseAddress(),
cm.getAdvertiseAddress(),
cm.logger,
)
// Start with join address (always joining existing cluster)
if err := instance.Start(cm.ctx, false, joinAddr); err != nil {
cm.logger.Error("Failed to start replacement instance", zap.Error(err))
cm.portManager.ReleasePortPair(ports)
return
}
// Add to active clusters
cm.mu.Lock()
cm.activeClusters[dbName] = instance
cm.mu.Unlock()
// Broadcast active status
cm.broadcastStatusUpdate(dbName, StatusActive)
cm.logger.Info("Replacement instance started successfully",
zap.String("database", dbName))
}

View File

@ -0,0 +1,519 @@
package rqlite
import (
"context"
"fmt"
"os"
"path/filepath"
"sync"
"time"
"github.com/DeBrosOfficial/network/pkg/config"
"github.com/DeBrosOfficial/network/pkg/pubsub"
"go.uber.org/zap"
)
// ClusterManager manages multiple RQLite database clusters on a single node
type ClusterManager struct {
nodeID string
config *config.DatabaseConfig
discoveryConfig *config.DiscoveryConfig
dataDir string
logger *zap.Logger
metadataStore *MetadataStore
activeClusters map[string]*RQLiteInstance // dbName -> instance
portManager *PortManager
pubsubAdapter *pubsub.ClientAdapter
coordinatorRegistry *CoordinatorRegistry
mu sync.RWMutex
ctx context.Context
cancel context.CancelFunc
}
// NewClusterManager creates a new cluster manager
func NewClusterManager(
nodeID string,
cfg *config.DatabaseConfig,
discoveryCfg *config.DiscoveryConfig,
dataDir string,
pubsubAdapter *pubsub.ClientAdapter,
logger *zap.Logger,
) *ClusterManager {
ctx, cancel := context.WithCancel(context.Background())
// Initialize port manager
portManager := NewPortManager(
PortRange{Start: cfg.PortRangeHTTPStart, End: cfg.PortRangeHTTPEnd},
PortRange{Start: cfg.PortRangeRaftStart, End: cfg.PortRangeRaftEnd},
)
return &ClusterManager{
nodeID: nodeID,
config: cfg,
discoveryConfig: discoveryCfg,
dataDir: dataDir,
logger: logger,
metadataStore: NewMetadataStore(),
activeClusters: make(map[string]*RQLiteInstance),
portManager: portManager,
pubsubAdapter: pubsubAdapter,
coordinatorRegistry: NewCoordinatorRegistry(),
ctx: ctx,
cancel: cancel,
}
}
// Start starts the cluster manager
func (cm *ClusterManager) Start() error {
cm.logger.Info("Starting cluster manager",
zap.String("node_id", cm.nodeID),
zap.Int("max_databases", cm.config.MaxDatabases))
// Subscribe to metadata topic
metadataTopic := "/debros/metadata/v1"
if err := cm.pubsubAdapter.Subscribe(cm.ctx, metadataTopic, cm.handleMetadataMessage); err != nil {
return fmt.Errorf("failed to subscribe to metadata topic: %w", err)
}
// Announce node capacity
go cm.announceCapacityPeriodically()
// Start health monitoring
go cm.monitorHealth()
// Start idle detection for hibernation
if cm.config.HibernationTimeout > 0 {
go cm.monitorIdleDatabases()
}
// Perform startup reconciliation
go cm.reconcileOrphanedData()
cm.logger.Info("Cluster manager started successfully")
return nil
}
// Stop stops the cluster manager
func (cm *ClusterManager) Stop() error {
cm.logger.Info("Stopping cluster manager")
cm.cancel()
// Stop all active clusters
cm.mu.Lock()
defer cm.mu.Unlock()
for dbName, instance := range cm.activeClusters {
cm.logger.Info("Stopping database instance",
zap.String("database", dbName))
if err := instance.Stop(); err != nil {
cm.logger.Warn("Error stopping database instance",
zap.String("database", dbName),
zap.Error(err))
}
}
cm.logger.Info("Cluster manager stopped")
return nil
}
// handleMetadataMessage processes incoming metadata messages
func (cm *ClusterManager) handleMetadataMessage(topic string, data []byte) error {
msg, err := UnmarshalMetadataMessage(data)
if err != nil {
// Not a metadata message (other pubsub traffic); log at debug and drop it
cm.logger.Debug("Ignoring non-metadata message on metadata topic", zap.Error(err))
return nil
}
// Skip messages from self
if msg.NodeID == cm.nodeID {
return nil
}
cm.logger.Debug("Received metadata message",
zap.String("type", string(msg.Type)),
zap.String("from", msg.NodeID))
switch msg.Type {
case MsgDatabaseCreateRequest:
return cm.handleCreateRequest(msg)
case MsgDatabaseCreateResponse:
return cm.handleCreateResponse(msg)
case MsgDatabaseCreateConfirm:
return cm.handleCreateConfirm(msg)
case MsgDatabaseStatusUpdate:
return cm.handleStatusUpdate(msg)
case MsgNodeCapacityAnnouncement:
return cm.handleCapacityAnnouncement(msg)
case MsgNodeHealthPing:
return cm.handleHealthPing(msg)
case MsgDatabaseIdleNotification:
return cm.handleIdleNotification(msg)
case MsgDatabaseShutdownCoordinated:
return cm.handleShutdownCoordinated(msg)
case MsgDatabaseWakeupRequest:
return cm.handleWakeupRequest(msg)
case MsgNodeReplacementNeeded:
return cm.handleNodeReplacementNeeded(msg)
case MsgNodeReplacementOffer:
return cm.handleNodeReplacementOffer(msg)
case MsgNodeReplacementConfirm:
return cm.handleNodeReplacementConfirm(msg)
case MsgMetadataSync:
return cm.handleMetadataSync(msg)
case MsgMetadataChecksumReq:
return cm.handleChecksumRequest(msg)
case MsgMetadataChecksumRes:
return cm.handleChecksumResponse(msg)
default:
cm.logger.Debug("Unhandled message type", zap.String("type", string(msg.Type)))
}
return nil
}
// CreateDatabase creates a new database cluster
func (cm *ClusterManager) CreateDatabase(dbName string, replicationFactor int) error {
cm.logger.Info("Initiating database creation",
zap.String("database", dbName),
zap.Int("replication_factor", replicationFactor))
// Check if database already exists
if existing := cm.metadataStore.GetDatabase(dbName); existing != nil {
return fmt.Errorf("database %s already exists", dbName)
}
// Create coordinator for this database creation
coordinator := NewCreateCoordinator(dbName, replicationFactor, cm.nodeID, cm.logger)
cm.coordinatorRegistry.Register(coordinator)
defer cm.coordinatorRegistry.Remove(dbName)
// Broadcast create request
req := DatabaseCreateRequest{
DatabaseName: dbName,
RequesterNodeID: cm.nodeID,
ReplicationFactor: replicationFactor,
}
msgData, err := MarshalMetadataMessage(MsgDatabaseCreateRequest, cm.nodeID, req)
if err != nil {
return fmt.Errorf("failed to marshal create request: %w", err)
}
topic := "/debros/metadata/v1"
if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
return fmt.Errorf("failed to publish create request: %w", err)
}
cm.logger.Info("Database create request broadcasted, waiting for responses",
zap.String("database", dbName))
// Wait for responses (2 seconds timeout)
waitCtx, cancel := context.WithTimeout(cm.ctx, 2*time.Second)
defer cancel()
if err := coordinator.WaitForResponses(waitCtx, 2*time.Second); err != nil {
cm.logger.Warn("Timeout waiting for responses", zap.String("database", dbName), zap.Error(err))
}
// Select nodes
responses := coordinator.GetResponses()
if len(responses) < replicationFactor {
return fmt.Errorf("insufficient nodes responded: got %d, need %d", len(responses), replicationFactor)
}
selectedResponses := coordinator.SelectNodes()
cm.logger.Info("Selected nodes for database",
zap.String("database", dbName),
zap.Int("count", len(selectedResponses)))
// Determine if this node is the coordinator (lowest ID among responders)
allNodeIDs := make([]string, len(selectedResponses))
for i, resp := range selectedResponses {
allNodeIDs[i] = resp.NodeID
}
coordinatorID := SelectCoordinator(allNodeIDs)
isCoordinator := coordinatorID == cm.nodeID
if isCoordinator {
cm.logger.Info("This node is coordinator, broadcasting confirmation",
zap.String("database", dbName))
// Build node assignments
assignments := make([]NodeAssignment, len(selectedResponses))
for i, resp := range selectedResponses {
role := "follower"
if i == 0 {
role = "leader"
}
assignments[i] = NodeAssignment{
NodeID: resp.NodeID,
HTTPPort: resp.AvailablePorts.HTTPPort,
RaftPort: resp.AvailablePorts.RaftPort,
Role: role,
}
}
// Broadcast confirmation
confirm := DatabaseCreateConfirm{
DatabaseName: dbName,
SelectedNodes: assignments,
CoordinatorNodeID: cm.nodeID,
}
msgData, err := MarshalMetadataMessage(MsgDatabaseCreateConfirm, cm.nodeID, confirm)
if err != nil {
return fmt.Errorf("failed to marshal create confirm: %w", err)
}
if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
return fmt.Errorf("failed to publish create confirm: %w", err)
}
cm.logger.Info("Database creation confirmation broadcasted",
zap.String("database", dbName))
}
return nil
}
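// Example (illustrative only; the database name is hypothetical):
//
//	if err := cm.CreateDatabase("myapp_users", 3); err != nil {
//	    // handle error
//	}
//
// CreateDatabase returns once the confirmation is broadcast; it does not
// wait for the rqlited instances to come up, so callers should poll the
// metadata store (or GetDatabase) for readiness before issuing queries.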
// GetDatabase returns the RQLite instance for a database
func (cm *ClusterManager) GetDatabase(dbName string) *RQLiteInstance {
cm.mu.RLock()
defer cm.mu.RUnlock()
return cm.activeClusters[dbName]
}
// ListDatabases returns all active database names
func (cm *ClusterManager) ListDatabases() []string {
cm.mu.RLock()
defer cm.mu.RUnlock()
names := make([]string, 0, len(cm.activeClusters))
for name := range cm.activeClusters {
names = append(names, name)
}
return names
}
// GetMetadataStore returns the metadata store
func (cm *ClusterManager) GetMetadataStore() *MetadataStore {
return cm.metadataStore
}
// announceCapacityPeriodically announces node capacity every 30 seconds
func (cm *ClusterManager) announceCapacityPeriodically() {
ticker := time.NewTicker(30 * time.Second)
defer ticker.Stop()
// Announce immediately
cm.announceCapacity()
for {
select {
case <-cm.ctx.Done():
return
case <-ticker.C:
cm.announceCapacity()
}
}
}
// announceCapacity announces this node's capacity
func (cm *ClusterManager) announceCapacity() {
cm.mu.RLock()
currentDatabases := len(cm.activeClusters)
cm.mu.RUnlock()
announcement := NodeCapacityAnnouncement{
NodeID: cm.nodeID,
MaxDatabases: cm.config.MaxDatabases,
CurrentDatabases: currentDatabases,
PortRangeHTTP: PortRange{Start: cm.config.PortRangeHTTPStart, End: cm.config.PortRangeHTTPEnd},
PortRangeRaft: PortRange{Start: cm.config.PortRangeRaftStart, End: cm.config.PortRangeRaftEnd},
}
msgData, err := MarshalMetadataMessage(MsgNodeCapacityAnnouncement, cm.nodeID, announcement)
if err != nil {
cm.logger.Warn("Failed to marshal capacity announcement", zap.Error(err))
return
}
topic := "/debros/metadata/v1"
if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
cm.logger.Warn("Failed to publish capacity announcement", zap.Error(err))
return
}
// Update local metadata store
capacity := &NodeCapacity{
NodeID: cm.nodeID,
MaxDatabases: cm.config.MaxDatabases,
CurrentDatabases: currentDatabases,
PortRangeHTTP: announcement.PortRangeHTTP,
PortRangeRaft: announcement.PortRangeRaft,
LastHealthCheck: time.Now(),
IsHealthy: true,
}
cm.metadataStore.SetNode(capacity)
}
// monitorHealth monitors the health of active databases
func (cm *ClusterManager) monitorHealth() {
ticker := time.NewTicker(cm.discoveryConfig.HealthCheckInterval)
defer ticker.Stop()
for {
select {
case <-cm.ctx.Done():
return
case <-ticker.C:
cm.checkDatabaseHealth()
}
}
}
// checkDatabaseHealth checks if all active databases are healthy
func (cm *ClusterManager) checkDatabaseHealth() {
cm.mu.RLock()
defer cm.mu.RUnlock()
for dbName, instance := range cm.activeClusters {
if !instance.IsRunning() {
cm.logger.Warn("Database instance is not running",
zap.String("database", dbName))
// TODO: Implement recovery logic
}
}
}
// monitorIdleDatabases monitors for idle databases to hibernate
func (cm *ClusterManager) monitorIdleDatabases() {
ticker := time.NewTicker(10 * time.Second)
defer ticker.Stop()
for {
select {
case <-cm.ctx.Done():
return
case <-ticker.C:
cm.detectIdleDatabases()
}
}
}
// detectIdleDatabases detects idle databases and broadcasts idle notifications
func (cm *ClusterManager) detectIdleDatabases() {
cm.mu.RLock()
defer cm.mu.RUnlock()
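// Note: the publishes below run while the read lock is held; fine for a
// small cluster map, but copy the candidate list first if publish latency
// becomes a concern.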
for dbName, instance := range cm.activeClusters {
if instance.IsIdle(cm.config.HibernationTimeout) && instance.Status == StatusActive {
cm.logger.Debug("Database is idle",
zap.String("database", dbName),
zap.Duration("idle_time", time.Since(instance.LastQuery)))
// Broadcast idle notification
notification := DatabaseIdleNotification{
DatabaseName: dbName,
NodeID: cm.nodeID,
LastActivity: instance.LastQuery,
}
msgData, err := MarshalMetadataMessage(MsgDatabaseIdleNotification, cm.nodeID, notification)
if err != nil {
cm.logger.Warn("Failed to marshal idle notification", zap.Error(err))
continue
}
topic := "/debros/metadata/v1"
if err := cm.pubsubAdapter.Publish(cm.ctx, topic, msgData); err != nil {
cm.logger.Warn("Failed to publish idle notification", zap.Error(err))
}
}
}
}
// reconcileOrphanedData checks for orphaned database directories
func (cm *ClusterManager) reconcileOrphanedData() {
// Wait a bit for metadata to sync
time.Sleep(10 * time.Second)
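// CAUTION: the deletions below are destructive. If metadata has not fully
// converged within the fixed 10s window, directories for legitimate
// databases could be misclassified as orphans; a longer window or a
// quarantine step (rename rather than delete) would be safer.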
cm.logger.Info("Starting orphaned data reconciliation")
// Read data directory
entries, err := os.ReadDir(cm.dataDir)
if err != nil {
cm.logger.Error("Failed to read data directory for reconciliation", zap.Error(err))
return
}
orphanCount := 0
for _, entry := range entries {
if !entry.IsDir() {
continue
}
dbName := entry.Name()
// Skip special directories
if dbName == "rqlite" || dbName == "." || dbName == ".." {
continue
}
// Check if this database exists in metadata
dbMeta := cm.metadataStore.GetDatabase(dbName)
if dbMeta == nil {
// Orphaned directory - no metadata exists
cm.logger.Warn("Found orphaned database directory",
zap.String("database", dbName))
orphanCount++
// Delete the orphaned directory
dbPath := filepath.Join(cm.dataDir, dbName)
if err := os.RemoveAll(dbPath); err != nil {
cm.logger.Error("Failed to remove orphaned directory",
zap.String("database", dbName),
zap.String("path", dbPath),
zap.Error(err))
} else {
cm.logger.Info("Removed orphaned database directory",
zap.String("database", dbName))
}
continue
}
// Check if this node is a member of the database
isMember := false
for _, nodeID := range dbMeta.NodeIDs {
if nodeID == cm.nodeID {
isMember = true
break
}
}
if !isMember {
// This node is not a member - orphaned data
cm.logger.Warn("Found database directory for non-member database",
zap.String("database", dbName))
orphanCount++
dbPath := filepath.Join(cm.dataDir, dbName)
if err := os.RemoveAll(dbPath); err != nil {
cm.logger.Error("Failed to remove non-member directory",
zap.String("database", dbName),
zap.String("path", dbPath),
zap.Error(err))
} else {
cm.logger.Info("Removed non-member database directory",
zap.String("database", dbName))
}
}
}
cm.logger.Info("Orphaned data reconciliation complete",
zap.Int("orphans_found", orphanCount))
}

180
pkg/rqlite/consensus.go Normal file
View File

@ -0,0 +1,180 @@
package rqlite
import (
"crypto/sha256"
"encoding/hex"
"encoding/json"
"sort"
"time"
)
// SelectCoordinator deterministically selects a coordinator from a list of node IDs
// Uses lexicographic ordering (lowest ID wins)
func SelectCoordinator(nodeIDs []string) string {
if len(nodeIDs) == 0 {
return ""
}
sorted := make([]string, len(nodeIDs))
copy(sorted, nodeIDs)
sort.Strings(sorted)
return sorted[0]
}
// ResolveConflict resolves a conflict between two database metadata entries
// and returns the winner. Precedence: vector-clock causality first, then
// version number, then creation timestamp, then lexicographic database name
// as a final deterministic tiebreaker.
func ResolveConflict(local, remote *DatabaseMetadata) *DatabaseMetadata {
if local == nil {
return remote
}
if remote == nil {
return local
}
// Compare vector clocks
localVC := VectorClock(local.VectorClock)
remoteVC := VectorClock(remote.VectorClock)
comparison := localVC.Compare(remoteVC)
if comparison == -1 {
// Local happens before remote, remote wins
return remote
} else if comparison == 1 {
// Remote happens before local, local wins
return local
}
// Concurrent: use version number as tiebreaker
if remote.Version > local.Version {
return remote
} else if local.Version > remote.Version {
return local
}
// Same version: use timestamp as tiebreaker
if remote.CreatedAt.After(local.CreatedAt) {
return remote
} else if local.CreatedAt.After(remote.CreatedAt) {
return local
}
// Same timestamp: use lexicographic comparison of database name
if remote.DatabaseName < local.DatabaseName {
return remote
}
return local
}
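// Example: with local clock {A:2, B:1} and remote clock {A:2, B:2}, every
// remote entry is >= its local counterpart and one is strictly greater, so
// local happened-before remote and the remote copy wins outright.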
// MetadataChecksum represents a checksum of database metadata
type MetadataChecksum struct {
DatabaseName string `json:"database_name"`
Version uint64 `json:"version"`
Hash string `json:"hash"`
}
// ComputeMetadataChecksum computes a checksum for database metadata
func ComputeMetadataChecksum(db *DatabaseMetadata) MetadataChecksum {
if db == nil {
return MetadataChecksum{}
}
// Create a canonical representation for hashing
canonical := struct {
DatabaseName string
NodeIDs []string
PortMappings map[string]PortPair
Status DatabaseStatus
}{
DatabaseName: db.DatabaseName,
NodeIDs: make([]string, len(db.NodeIDs)),
PortMappings: db.PortMappings,
Status: db.Status,
}
// Sort node IDs for deterministic hashing
copy(canonical.NodeIDs, db.NodeIDs)
sort.Strings(canonical.NodeIDs)
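// PortMappings needs no explicit ordering: encoding/json writes map keys
// in sorted order, so the serialized form (and hash) is deterministic.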
// Serialize and hash
data, _ := json.Marshal(canonical)
hash := sha256.Sum256(data)
return MetadataChecksum{
DatabaseName: db.DatabaseName,
Version: db.Version,
Hash: hex.EncodeToString(hash[:]),
}
}
// ComputeFullStateChecksum computes checksums for all databases in the store
func ComputeFullStateChecksum(store *MetadataStore) []MetadataChecksum {
checksums := make([]MetadataChecksum, 0)
for _, name := range store.ListDatabases() {
if db := store.GetDatabase(name); db != nil {
checksums = append(checksums, ComputeMetadataChecksum(db))
}
}
// Sort by database name for deterministic ordering
sort.Slice(checksums, func(i, j int) bool {
return checksums[i].DatabaseName < checksums[j].DatabaseName
})
return checksums
}
// SelectNodesForDatabase selects up to replicationFactor nodes from the
// list of healthy nodes. Sorting makes the selection deterministic across
// peers, but it also biases placement toward lexicographically small IDs;
// capacity-aware selection would spread load more evenly.
func SelectNodesForDatabase(healthyNodes []string, replicationFactor int) []string {
if len(healthyNodes) == 0 {
return []string{}
}
// Sort for deterministic selection
sorted := make([]string, len(healthyNodes))
copy(sorted, healthyNodes)
sort.Strings(sorted)
// Select first N nodes
n := replicationFactor
if n > len(sorted) {
n = len(sorted)
}
return sorted[:n]
}
// IsNodeInCluster checks if a node is part of a database cluster
func IsNodeInCluster(nodeID string, db *DatabaseMetadata) bool {
if db == nil {
return false
}
for _, id := range db.NodeIDs {
if id == nodeID {
return true
}
}
return false
}
// UpdateDatabaseMetadata updates metadata with vector clock and version increment
func UpdateDatabaseMetadata(db *DatabaseMetadata, nodeID string) {
if db.VectorClock == nil {
db.VectorClock = NewVectorClock()
}
// Increment vector clock for this node
vc := VectorClock(db.VectorClock)
vc.Increment(nodeID)
// Increment version
db.Version++
// Update last accessed time
db.LastAccessed = time.Now()
}

145
pkg/rqlite/coordinator.go Normal file
View File

@ -0,0 +1,145 @@
package rqlite
import (
"context"
"fmt"
"sort"
"sync"
"time"
"go.uber.org/zap"
)
// CreateCoordinator coordinates the database creation process
type CreateCoordinator struct {
dbName string
replicationFactor int
requesterID string
responses []DatabaseCreateResponse
mu sync.Mutex
logger *zap.Logger
}
// NewCreateCoordinator creates a new coordinator for database creation
func NewCreateCoordinator(dbName string, replicationFactor int, requesterID string, logger *zap.Logger) *CreateCoordinator {
return &CreateCoordinator{
dbName: dbName,
replicationFactor: replicationFactor,
requesterID: requesterID,
responses: make([]DatabaseCreateResponse, 0),
logger: logger,
}
}
// AddResponse adds a response from a node
func (cc *CreateCoordinator) AddResponse(response DatabaseCreateResponse) {
cc.mu.Lock()
defer cc.mu.Unlock()
cc.responses = append(cc.responses, response)
}
// GetResponses returns all collected responses
func (cc *CreateCoordinator) GetResponses() []DatabaseCreateResponse {
cc.mu.Lock()
defer cc.mu.Unlock()
return append([]DatabaseCreateResponse(nil), cc.responses...)
}
// ResponseCount returns the number of responses received
func (cc *CreateCoordinator) ResponseCount() int {
cc.mu.Lock()
defer cc.mu.Unlock()
return len(cc.responses)
}
// SelectNodes selects the best nodes for the database cluster
func (cc *CreateCoordinator) SelectNodes() []DatabaseCreateResponse {
cc.mu.Lock()
defer cc.mu.Unlock()
if len(cc.responses) < cc.replicationFactor {
cc.logger.Warn("Insufficient responses for database creation",
zap.String("database", cc.dbName),
zap.Int("required", cc.replicationFactor),
zap.Int("received", len(cc.responses)))
// Return what we have if less than required
return cc.responses
}
// Sort responses by node ID for deterministic selection
sorted := make([]DatabaseCreateResponse, len(cc.responses))
copy(sorted, cc.responses)
sort.Slice(sorted, func(i, j int) bool {
return sorted[i].NodeID < sorted[j].NodeID
})
// Select first N nodes
return sorted[:cc.replicationFactor]
}
// WaitForResponses waits for responses with a timeout
func (cc *CreateCoordinator) WaitForResponses(ctx context.Context, timeout time.Duration) error {
deadline := time.Now().Add(timeout)
ticker := time.NewTicker(100 * time.Millisecond)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return ctx.Err()
case <-ticker.C:
if time.Now().After(deadline) {
return fmt.Errorf("timeout waiting for responses")
}
if cc.ResponseCount() >= cc.replicationFactor {
return nil
}
}
}
}
// CoordinatorRegistry manages active coordinators for database creation
type CoordinatorRegistry struct {
coordinators map[string]*CreateCoordinator // dbName -> coordinator
mu sync.RWMutex
}
// NewCoordinatorRegistry creates a new coordinator registry
func NewCoordinatorRegistry() *CoordinatorRegistry {
return &CoordinatorRegistry{
coordinators: make(map[string]*CreateCoordinator),
}
}
// Register registers a new coordinator
func (cr *CoordinatorRegistry) Register(coordinator *CreateCoordinator) {
cr.mu.Lock()
defer cr.mu.Unlock()
cr.coordinators[coordinator.dbName] = coordinator
}
// Get retrieves a coordinator by database name
func (cr *CoordinatorRegistry) Get(dbName string) *CreateCoordinator {
cr.mu.RLock()
defer cr.mu.RUnlock()
return cr.coordinators[dbName]
}
// Remove removes a coordinator
func (cr *CoordinatorRegistry) Remove(dbName string) {
cr.mu.Lock()
defer cr.mu.Unlock()
delete(cr.coordinators, dbName)
}
// HandleCreateResponse routes a CREATE_RESPONSE message to the coordinator
// registered for its database, if any; responses for databases with no
// registered coordinator (e.g. after timeout and removal) are dropped.
func (cr *CoordinatorRegistry) HandleCreateResponse(response DatabaseCreateResponse) {
cr.mu.RLock()
coordinator := cr.coordinators[response.DatabaseName]
cr.mu.RUnlock()
if coordinator != nil {
coordinator.AddResponse(response)
}
}

View File

@ -1,615 +0,0 @@
package rqlite
// HTTP gateway for the rqlite ORM client.
//
// This file exposes a minimal, SDK-friendly HTTP interface over the ORM-like
// client defined in client.go. It maps high-level operations (Query, Exec,
// FindBy, FindOneBy, QueryBuilder-based SELECTs, Transactions) and a few schema
// helpers into JSON-over-HTTP endpoints that can be called from any language.
//
// Endpoints (under BasePath, default: /v1/db):
// - POST {base}/query -> arbitrary SELECT; returns rows as []map[string]any
// - POST {base}/exec -> write statement (INSERT/UPDATE/DELETE/DDL); returns {rows_affected,last_insert_id}
// - POST {base}/find -> FindBy(table, criteria, opts...) -> returns []map
// - POST {base}/find-one -> FindOneBy(table, criteria, opts...) -> returns map
// - POST {base}/select -> Fluent SELECT builder via JSON (joins, where, order, group, limit, offset); returns []map or one map if one=true
// - POST {base}/transaction -> Execute a sequence of exec/query ops atomically; optionally return results
//
// Schema helpers (convenience; powered via Exec/Query):
// - GET {base}/schema -> list of user tables/views and create SQL
// - POST {base}/create-table -> {schema: "CREATE TABLE ..."} -> status ok
// - POST {base}/drop-table -> {table: "name"} -> status ok (safe-validated identifier)
//
// Notes:
// - All numbers in JSON are decoded as float64 by default; we best-effort coerce
// integral values to int64 for SQL placeholders.
// - The Save/Remove reflection helpers in the ORM require concrete Go structs;
// exposing them generically over HTTP is not portable. Prefer using the Exec
// and Find APIs, or the Select builder for CRUD-like flows.
import (
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"net/http"
"regexp"
"strings"
"time"
)
// HTTPGateway exposes the ORM Client as a set of HTTP handlers.
type HTTPGateway struct {
// Client is the ORM-like rqlite client to execute operations against.
Client Client
// BasePath is the prefix for all routes, e.g. "/v1/db".
// If empty, defaults to "/v1/db". A trailing slash is trimmed.
BasePath string
// Optional: Request timeout. If > 0, handlers will use a context with this timeout.
Timeout time.Duration
}
// NewHTTPGateway constructs a new HTTPGateway with sensible defaults.
func NewHTTPGateway(c Client, base string) *HTTPGateway {
return &HTTPGateway{
Client: c,
BasePath: base,
}
}
// RegisterRoutes registers all handlers onto the provided mux under BasePath.
func (g *HTTPGateway) RegisterRoutes(mux *http.ServeMux) {
base := g.base()
mux.HandleFunc(base+"/query", g.handleQuery)
mux.HandleFunc(base+"/exec", g.handleExec)
mux.HandleFunc(base+"/find", g.handleFind)
mux.HandleFunc(base+"/find-one", g.handleFindOne)
mux.HandleFunc(base+"/select", g.handleSelect)
// Keep "transaction" for compatibility with existing routes.
mux.HandleFunc(base+"/transaction", g.handleTransaction)
// Schema helpers
mux.HandleFunc(base+"/schema", g.handleSchema)
mux.HandleFunc(base+"/create-table", g.handleCreateTable)
mux.HandleFunc(base+"/drop-table", g.handleDropTable)
}
func (g *HTTPGateway) base() string {
b := strings.TrimSpace(g.BasePath)
if b == "" {
b = "/v1/db"
}
if b != "/" {
b = strings.TrimRight(b, "/")
}
return b
}
func (g *HTTPGateway) withTimeout(ctx context.Context) (context.Context, context.CancelFunc) {
if g.Timeout > 0 {
return context.WithTimeout(ctx, g.Timeout)
}
return context.WithCancel(ctx)
}
// --------------------
// Common HTTP helpers
// --------------------
func writeJSON(w http.ResponseWriter, code int, v any) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(code)
_ = json.NewEncoder(w).Encode(v)
}
func writeError(w http.ResponseWriter, code int, msg string) {
writeJSON(w, code, map[string]any{"error": msg})
}
func onlyMethod(w http.ResponseWriter, r *http.Request, method string) bool {
if r.Method != method {
writeError(w, http.StatusMethodNotAllowed, "method not allowed")
return false
}
return true
}
// Normalize JSON-decoded args for SQL placeholders.
// - Convert float64 with integral value to int64 to better match SQLite expectations.
// - Leave strings, bools and nulls as-is.
// - Recursively normalizes nested arrays if present.
func normalizeArgs(args []any) []any {
out := make([]any, len(args))
for i, a := range args {
switch v := a.(type) {
case float64:
// If v is exactly integral, convert to int64
if v == float64(int64(v)) {
out[i] = int64(v)
} else {
out[i] = v
}
case []any:
out[i] = normalizeArgs(v)
default:
out[i] = a
}
}
return out
}
// --------------------
// Request DTOs
// --------------------
type queryRequest struct {
SQL string `json:"sql"`
Args []any `json:"args"`
}
type execRequest struct {
SQL string `json:"sql"`
Args []any `json:"args"`
}
type findOptions struct {
Select []string `json:"select"`
OrderBy []string `json:"order_by"`
GroupBy []string `json:"group_by"`
Limit *int `json:"limit"`
Offset *int `json:"offset"`
Joins []joinBody `json:"joins"`
}
type findRequest struct {
Table string `json:"table"`
Criteria map[string]any `json:"criteria"`
Options findOptions `json:"options"`
// Back-compat: allow options at top-level too
Select []string `json:"select"`
OrderBy []string `json:"order_by"`
GroupBy []string `json:"group_by"`
Limit *int `json:"limit"`
Offset *int `json:"offset"`
Joins []joinBody `json:"joins"`
}
type findOneRequest = findRequest
type joinBody struct {
Kind string `json:"kind"` // "INNER" | "LEFT" | "JOIN"
Table string `json:"table"` // table name
On string `json:"on"` // join condition
}
type whereBody struct {
Conj string `json:"conj"` // "AND" | "OR" (default AND)
Expr string `json:"expr"` // e.g., "a = ? AND b > ?"
Args []any `json:"args"`
}
type selectRequest struct {
Table string `json:"table"`
Alias string `json:"alias"`
Select []string `json:"select"`
Joins []joinBody `json:"joins"`
Where []whereBody `json:"where"`
GroupBy []string `json:"group_by"`
OrderBy []string `json:"order_by"`
Limit *int `json:"limit"`
Offset *int `json:"offset"`
One bool `json:"one"` // if true, returns a single row (object)
}
type txOp struct {
Kind string `json:"kind"` // "exec" | "query"
SQL string `json:"sql"`
Args []any `json:"args"`
}
type transactionRequest struct {
Ops []txOp `json:"ops"`
ReturnResults bool `json:"return_results"` // if true, returns per-op results
StopOnError bool `json:"stop_on_error"` // default true in tx
PartialResults bool `json:"partial_results"` // ignored for actual TX (atomic); kept for API symmetry
}
// --------------------
// Handlers
// --------------------
func (g *HTTPGateway) handleQuery(w http.ResponseWriter, r *http.Request) {
if !onlyMethod(w, r, http.MethodPost) {
return
}
if g.Client == nil {
writeError(w, http.StatusServiceUnavailable, "client not initialized")
return
}
var body queryRequest
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.SQL) == "" {
writeError(w, http.StatusBadRequest, "invalid body: {sql, args?}")
return
}
args := normalizeArgs(body.Args)
ctx, cancel := g.withTimeout(r.Context())
defer cancel()
out := make([]map[string]any, 0, 16)
if err := g.Client.Query(ctx, &out, body.SQL, args...); err != nil {
writeError(w, http.StatusInternalServerError, err.Error())
return
}
writeJSON(w, http.StatusOK, map[string]any{
"items": out,
"count": len(out),
})
}
func (g *HTTPGateway) handleExec(w http.ResponseWriter, r *http.Request) {
if !onlyMethod(w, r, http.MethodPost) {
return
}
if g.Client == nil {
writeError(w, http.StatusServiceUnavailable, "client not initialized")
return
}
var body execRequest
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.SQL) == "" {
writeError(w, http.StatusBadRequest, "invalid body: {sql, args?}")
return
}
args := normalizeArgs(body.Args)
ctx, cancel := g.withTimeout(r.Context())
defer cancel()
res, err := g.Client.Exec(ctx, body.SQL, args...)
if err != nil {
writeError(w, http.StatusInternalServerError, err.Error())
return
}
liid, _ := res.LastInsertId()
ra, _ := res.RowsAffected()
writeJSON(w, http.StatusOK, map[string]any{
"rows_affected": ra,
"last_insert_id": liid,
"execution_state": "ok",
})
}
func (g *HTTPGateway) handleFind(w http.ResponseWriter, r *http.Request) {
if !onlyMethod(w, r, http.MethodPost) {
return
}
if g.Client == nil {
writeError(w, http.StatusServiceUnavailable, "client not initialized")
return
}
var body findRequest
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Table) == "" {
writeError(w, http.StatusBadRequest, "invalid body: {table, criteria, options?}")
return
}
opts := makeFindOptions(mergeFindOptions(body))
ctx, cancel := g.withTimeout(r.Context())
defer cancel()
out := make([]map[string]any, 0, 32)
if err := g.Client.FindBy(ctx, &out, body.Table, body.Criteria, opts...); err != nil {
writeError(w, http.StatusInternalServerError, err.Error())
return
}
writeJSON(w, http.StatusOK, map[string]any{
"items": out,
"count": len(out),
})
}
func (g *HTTPGateway) handleFindOne(w http.ResponseWriter, r *http.Request) {
if !onlyMethod(w, r, http.MethodPost) {
return
}
if g.Client == nil {
writeError(w, http.StatusServiceUnavailable, "client not initialized")
return
}
var body findOneRequest
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Table) == "" {
writeError(w, http.StatusBadRequest, "invalid body: {table, criteria, options?}")
return
}
opts := makeFindOptions(mergeFindOptions(body))
ctx, cancel := g.withTimeout(r.Context())
defer cancel()
row := make(map[string]any)
if err := g.Client.FindOneBy(ctx, &row, body.Table, body.Criteria, opts...); err != nil {
if errors.Is(err, sql.ErrNoRows) {
writeError(w, http.StatusNotFound, "not found")
return
}
writeError(w, http.StatusInternalServerError, err.Error())
return
}
writeJSON(w, http.StatusOK, row)
}
func (g *HTTPGateway) handleSelect(w http.ResponseWriter, r *http.Request) {
if !onlyMethod(w, r, http.MethodPost) {
return
}
if g.Client == nil {
writeError(w, http.StatusServiceUnavailable, "client not initialized")
return
}
var body selectRequest
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Table) == "" {
writeError(w, http.StatusBadRequest, "invalid body: {table, select?, where?, joins?, order_by?, group_by?, limit?, offset?, one?}")
return
}
ctx, cancel := g.withTimeout(r.Context())
defer cancel()
qb := g.Client.CreateQueryBuilder(body.Table)
if alias := strings.TrimSpace(body.Alias); alias != "" {
qb = qb.Alias(alias)
}
if len(body.Select) > 0 {
qb = qb.Select(body.Select...)
}
// joins
for _, j := range body.Joins {
switch strings.ToUpper(strings.TrimSpace(j.Kind)) {
case "INNER":
qb = qb.InnerJoin(j.Table, j.On)
case "LEFT":
qb = qb.LeftJoin(j.Table, j.On)
default:
qb = qb.Join(j.Table, j.On)
}
}
// where
for _, wcl := range body.Where {
switch strings.ToUpper(strings.TrimSpace(wcl.Conj)) {
case "OR":
qb = qb.OrWhere(wcl.Expr, normalizeArgs(wcl.Args)...)
default:
qb = qb.AndWhere(wcl.Expr, normalizeArgs(wcl.Args)...)
}
}
// group/order/limit/offset
if len(body.GroupBy) > 0 {
qb = qb.GroupBy(body.GroupBy...)
}
if len(body.OrderBy) > 0 {
qb = qb.OrderBy(body.OrderBy...)
}
if body.Limit != nil {
qb = qb.Limit(*body.Limit)
}
if body.Offset != nil {
qb = qb.Offset(*body.Offset)
}
if body.One {
row := make(map[string]any)
if err := qb.GetOne(ctx, &row); err != nil {
if errors.Is(err, sql.ErrNoRows) {
writeError(w, http.StatusNotFound, "not found")
return
}
writeError(w, http.StatusInternalServerError, err.Error())
return
}
writeJSON(w, http.StatusOK, row)
return
}
rows := make([]map[string]any, 0, 32)
if err := qb.GetMany(ctx, &rows); err != nil {
writeError(w, http.StatusInternalServerError, err.Error())
return
}
writeJSON(w, http.StatusOK, map[string]any{
"items": rows,
"count": len(rows),
})
}
func (g *HTTPGateway) handleTransaction(w http.ResponseWriter, r *http.Request) {
if !onlyMethod(w, r, http.MethodPost) {
return
}
if g.Client == nil {
writeError(w, http.StatusServiceUnavailable, "client not initialized")
return
}
var body transactionRequest
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || len(body.Ops) == 0 {
writeError(w, http.StatusBadRequest, "invalid body: {ops:[{kind,sql,args?}], return_results?}")
return
}
ctx, cancel := g.withTimeout(r.Context())
defer cancel()
results := make([]any, 0, len(body.Ops))
err := g.Client.Tx(ctx, func(tx Tx) error {
for _, op := range body.Ops {
switch strings.ToLower(strings.TrimSpace(op.Kind)) {
case "exec":
res, err := tx.Exec(ctx, op.SQL, normalizeArgs(op.Args)...)
if err != nil {
return err
}
if body.ReturnResults {
li, _ := res.LastInsertId()
ra, _ := res.RowsAffected()
results = append(results, map[string]any{
"rows_affected": ra,
"last_insert_id": li,
})
}
case "query":
var rows []map[string]any
if err := tx.Query(ctx, &rows, op.SQL, normalizeArgs(op.Args)...); err != nil {
return err
}
if body.ReturnResults {
results = append(results, rows)
}
default:
return fmt.Errorf("invalid op kind: %s", op.Kind)
}
}
return nil
})
if err != nil {
writeError(w, http.StatusInternalServerError, err.Error())
return
}
if body.ReturnResults {
writeJSON(w, http.StatusOK, map[string]any{
"status": "ok",
"results": results,
})
return
}
writeJSON(w, http.StatusOK, map[string]any{"status": "ok"})
}
// --------------------
// Schema helpers
// --------------------
func (g *HTTPGateway) handleSchema(w http.ResponseWriter, r *http.Request) {
if !onlyMethod(w, r, http.MethodGet) {
return
}
if g.Client == nil {
writeError(w, http.StatusServiceUnavailable, "client not initialized")
return
}
ctx, cancel := g.withTimeout(r.Context())
defer cancel()
sqlText := `SELECT name, type, sql FROM sqlite_master WHERE type IN ('table','view') AND name NOT LIKE 'sqlite_%' ORDER BY name`
var rows []map[string]any
if err := g.Client.Query(ctx, &rows, sqlText); err != nil {
writeError(w, http.StatusInternalServerError, err.Error())
return
}
writeJSON(w, http.StatusOK, map[string]any{
"objects": rows,
"count": len(rows),
})
}
func (g *HTTPGateway) handleCreateTable(w http.ResponseWriter, r *http.Request) {
if !onlyMethod(w, r, http.MethodPost) {
return
}
if g.Client == nil {
writeError(w, http.StatusServiceUnavailable, "client not initialized")
return
}
var body struct {
Schema string `json:"schema"`
}
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Schema) == "" {
writeError(w, http.StatusBadRequest, "invalid body: {schema}")
return
}
ctx, cancel := g.withTimeout(r.Context())
defer cancel()
if _, err := g.Client.Exec(ctx, body.Schema); err != nil {
writeError(w, http.StatusInternalServerError, err.Error())
return
}
writeJSON(w, http.StatusCreated, map[string]any{"status": "ok"})
}
var identRe = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]*$`)
func (g *HTTPGateway) handleDropTable(w http.ResponseWriter, r *http.Request) {
if !onlyMethod(w, r, http.MethodPost) {
return
}
if g.Client == nil {
writeError(w, http.StatusServiceUnavailable, "client not initialized")
return
}
var body struct {
Table string `json:"table"`
}
if err := json.NewDecoder(r.Body).Decode(&body); err != nil || strings.TrimSpace(body.Table) == "" {
writeError(w, http.StatusBadRequest, "invalid body: {table}")
return
}
tbl := strings.TrimSpace(body.Table)
if !identRe.MatchString(tbl) {
writeError(w, http.StatusBadRequest, "invalid table identifier")
return
}
ctx, cancel := g.withTimeout(r.Context())
defer cancel()
stmt := "DROP TABLE IF EXISTS " + tbl
if _, err := g.Client.Exec(ctx, stmt); err != nil {
writeError(w, http.StatusInternalServerError, err.Error())
return
}
writeJSON(w, http.StatusOK, map[string]any{"status": "ok"})
}
// --------------------
// Helpers
// --------------------
func mergeFindOptions(fr findRequest) findOptions {
// Prefer nested Options; fallback to top-level legacy fields
if (len(fr.Options.Select)+len(fr.Options.OrderBy)+len(fr.Options.GroupBy)) > 0 ||
fr.Options.Limit != nil || fr.Options.Offset != nil || len(fr.Options.Joins) > 0 {
return fr.Options
}
return findOptions{
Select: fr.Select,
OrderBy: fr.OrderBy,
GroupBy: fr.GroupBy,
Limit: fr.Limit,
Offset: fr.Offset,
Joins: fr.Joins,
}
}
func makeFindOptions(o findOptions) []FindOption {
opts := make([]FindOption, 0, 6)
if len(o.OrderBy) > 0 {
opts = append(opts, WithOrderBy(o.OrderBy...))
}
if len(o.GroupBy) > 0 {
opts = append(opts, WithGroupBy(o.GroupBy...))
}
if o.Limit != nil {
opts = append(opts, WithLimit(*o.Limit))
}
if o.Offset != nil {
opts = append(opts, WithOffset(*o.Offset))
}
if len(o.Select) > 0 {
opts = append(opts, WithSelect(o.Select...))
}
for _, j := range o.Joins {
opts = append(opts, WithJoin(justOrDefault(strings.ToUpper(j.Kind), "JOIN"), j.Table, j.On))
}
return opts
}
func justOrDefault(s, def string) string {
if strings.TrimSpace(s) == "" {
return def
}
return s
}

240
pkg/rqlite/instance.go Normal file
View File

@ -0,0 +1,240 @@
package rqlite
import (
"context"
"errors"
"fmt"
"net/http"
"os"
"os/exec"
"path/filepath"
"syscall"
"time"
"github.com/rqlite/gorqlite"
"go.uber.org/zap"
)
// RQLiteInstance represents a single rqlite database instance
type RQLiteInstance struct {
DatabaseName string
HTTPPort int
RaftPort int
DataDir string
AdvHTTPAddr string // Advertised HTTP address
AdvRaftAddr string // Advertised Raft address
Cmd *exec.Cmd
Connection *gorqlite.Connection
LastQuery time.Time
Status DatabaseStatus
logger *zap.Logger
}
// NewRQLiteInstance creates a new RQLite instance configuration
func NewRQLiteInstance(dbName string, ports PortPair, dataDir string, advHTTPAddr, advRaftAddr string, logger *zap.Logger) *RQLiteInstance {
return &RQLiteInstance{
DatabaseName: dbName,
HTTPPort: ports.HTTPPort,
RaftPort: ports.RaftPort,
DataDir: filepath.Join(dataDir, dbName, "rqlite"),
AdvHTTPAddr: advHTTPAddr,
AdvRaftAddr: advRaftAddr,
Status: StatusInitializing,
logger: logger,
}
}
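// Example (illustrative; ports, paths, and addresses are hypothetical):
//
//	inst := NewRQLiteInstance("myapp_users",
//	    PortPair{HTTPPort: 5001, RaftPort: 7001},
//	    "./data", "10.0.0.5:5001", "10.0.0.5:7001", logger)
//	if err := inst.Start(ctx, true, ""); err != nil {
//	    // handle startup failure
//	}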
// Start starts the rqlite subprocess
func (ri *RQLiteInstance) Start(ctx context.Context, isLeader bool, joinAddr string) error {
// Create data directory
if err := os.MkdirAll(ri.DataDir, 0755); err != nil {
return fmt.Errorf("failed to create data directory: %w", err)
}
// Build rqlited command
args := []string{
"-http-addr", fmt.Sprintf("0.0.0.0:%d", ri.HTTPPort),
"-raft-addr", fmt.Sprintf("0.0.0.0:%d", ri.RaftPort),
}
// Add advertised addresses if provided
if ri.AdvHTTPAddr != "" {
args = append(args, "-http-adv-addr", ri.AdvHTTPAddr)
}
if ri.AdvRaftAddr != "" {
args = append(args, "-raft-adv-addr", ri.AdvRaftAddr)
}
// Add join address if this is a follower
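// NOTE: the value -join expects differs across rqlited major versions (an
// HTTP API URL in v7, a bare Raft host:port in v8); verify the address
// built by callers matches the installed binary.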
if !isLeader && joinAddr != "" {
args = append(args, "-join", joinAddr)
}
// Add data directory as positional argument
args = append(args, ri.DataDir)
ri.logger.Info("Starting RQLite instance",
zap.String("database", ri.DatabaseName),
zap.Int("http_port", ri.HTTPPort),
zap.Int("raft_port", ri.RaftPort),
zap.String("data_dir", ri.DataDir),
zap.Bool("is_leader", isLeader),
zap.Strings("args", args))
// Start RQLite process
ri.Cmd = exec.Command("rqlited", args...)
// Optionally capture stdout/stderr for debugging
// ri.Cmd.Stdout = os.Stdout
// ri.Cmd.Stderr = os.Stderr
if err := ri.Cmd.Start(); err != nil {
return fmt.Errorf("failed to start rqlited: %w", err)
}
// Wait for RQLite to be ready
if err := ri.waitForReady(ctx); err != nil {
ri.Stop()
return fmt.Errorf("rqlited failed to become ready: %w", err)
}
// Create connection
conn, err := gorqlite.Open(fmt.Sprintf("http://localhost:%d", ri.HTTPPort))
if err != nil {
ri.Stop()
return fmt.Errorf("failed to connect to rqlited: %w", err)
}
ri.Connection = conn
// Wait for SQL availability
if err := ri.waitForSQLAvailable(ctx); err != nil {
ri.Stop()
return fmt.Errorf("rqlited SQL not available: %w", err)
}
ri.Status = StatusActive
ri.LastQuery = time.Now()
ri.logger.Info("RQLite instance started successfully",
zap.String("database", ri.DatabaseName))
return nil
}
// Stop stops the rqlite subprocess gracefully
func (ri *RQLiteInstance) Stop() error {
if ri.Connection != nil {
ri.Connection.Close()
ri.Connection = nil
}
if ri.Cmd == nil || ri.Cmd.Process == nil {
return nil
}
ri.logger.Info("Stopping RQLite instance",
zap.String("database", ri.DatabaseName))
// Try SIGTERM first
if err := ri.Cmd.Process.Signal(syscall.SIGTERM); err != nil {
// Fallback to Kill if signaling fails
_ = ri.Cmd.Process.Kill()
ri.Status = StatusHibernating
return nil
}
// Wait up to 5 seconds for graceful shutdown
done := make(chan error, 1)
go func() { done <- ri.Cmd.Wait() }()
select {
case err := <-done:
if err != nil && !errors.Is(err, os.ErrClosed) {
ri.logger.Warn("RQLite process exited with error",
zap.String("database", ri.DatabaseName),
zap.Error(err))
}
case <-time.After(5 * time.Second):
ri.logger.Warn("RQLite did not exit in time; killing",
zap.String("database", ri.DatabaseName))
_ = ri.Cmd.Process.Kill()
}
ri.Status = StatusHibernating
return nil
}
// waitForReady waits for RQLite HTTP endpoint to be ready
func (ri *RQLiteInstance) waitForReady(ctx context.Context) error {
url := fmt.Sprintf("http://localhost:%d/status", ri.HTTPPort)
client := &http.Client{Timeout: 2 * time.Second}
for i := 0; i < 30; i++ {
select {
case <-ctx.Done():
return ctx.Err()
default:
}
resp, err := client.Get(url)
if err == nil {
resp.Body.Close()
if resp.StatusCode == http.StatusOK {
return nil
}
}
time.Sleep(1 * time.Second)
}
return fmt.Errorf("rqlited did not become ready within timeout")
}
// waitForSQLAvailable waits until SQL queries can be executed
func (ri *RQLiteInstance) waitForSQLAvailable(ctx context.Context) error {
if ri.Connection == nil {
return errors.New("no rqlite connection")
}
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()
for i := 0; i < 30; i++ {
select {
case <-ctx.Done():
return ctx.Err()
case <-ticker.C:
_, err := ri.Connection.QueryOne("SELECT 1")
if err == nil {
return nil
}
if i%5 == 0 {
ri.logger.Debug("Waiting for RQLite SQL availability",
zap.String("database", ri.DatabaseName),
zap.Error(err))
}
}
}
return fmt.Errorf("rqlited SQL not available within timeout")
}
// UpdateLastQuery updates the last query timestamp
func (ri *RQLiteInstance) UpdateLastQuery() {
ri.LastQuery = time.Now()
}
// IsIdle checks if the instance has been idle for the given duration
func (ri *RQLiteInstance) IsIdle(timeout time.Duration) bool {
return time.Since(ri.LastQuery) > timeout
}
// IsRunning checks if the rqlite process is running
func (ri *RQLiteInstance) IsRunning() bool {
if ri.Cmd == nil || ri.Cmd.Process == nil {
return false
}
// Check if process is still alive
err := ri.Cmd.Process.Signal(syscall.Signal(0))
return err == nil
}
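
For orientation, a minimal lifecycle sketch; the function name, database name, ports, data directory, and advertised addresses below are illustrative assumptions, not values shipped in this commit:

func exampleInstanceLifecycle(ctx context.Context, logger *zap.Logger) error {
    inst := NewRQLiteInstance(
        "my_app_exampledb_1",
        PortPair{HTTPPort: 5101, RaftPort: 7101},
        "./data",
        "203.0.113.10:5101", // advertised HTTP address (must be reachable by peers)
        "203.0.113.10:7101", // advertised Raft address
        logger,
    )
    if err := inst.Start(ctx, true, ""); err != nil { // leader: empty join address
        return err
    }
    defer inst.Stop()
    // ... serve queries; call inst.UpdateLastQuery() on each one so idle
    // detection (IsIdle) reflects real activity ...
    return nil
}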

pkg/rqlite/metadata.go (new file)
@@ -0,0 +1,153 @@
package rqlite
import (
"sync"
"time"
)
// DatabaseStatus represents the state of a database cluster
type DatabaseStatus string
const (
StatusInitializing DatabaseStatus = "initializing"
StatusActive DatabaseStatus = "active"
StatusHibernating DatabaseStatus = "hibernating"
StatusWaking DatabaseStatus = "waking"
)
// PortPair represents HTTP and Raft ports for a database instance
type PortPair struct {
HTTPPort int `json:"http_port"`
RaftPort int `json:"raft_port"`
}
// DatabaseMetadata contains metadata for a single database cluster
type DatabaseMetadata struct {
DatabaseName string `json:"database_name"` // e.g., "my_app_exampledb_1"
NodeIDs []string `json:"node_ids"` // Peer IDs hosting this database
PortMappings map[string]PortPair `json:"port_mappings"` // nodeID -> {HTTP port, Raft port}
Status DatabaseStatus `json:"status"` // Current status
CreatedAt time.Time `json:"created_at"`
LastAccessed time.Time `json:"last_accessed"`
LeaderNodeID string `json:"leader_node_id"` // Which node is rqlite leader
Version uint64 `json:"version"` // For conflict resolution
VectorClock map[string]uint64 `json:"vector_clock"` // For distributed consensus
}
// NodeCapacity represents capacity information for a node
type NodeCapacity struct {
NodeID string `json:"node_id"`
MaxDatabases int `json:"max_databases"` // Configured limit
CurrentDatabases int `json:"current_databases"` // How many currently active
PortRangeHTTP PortRange `json:"port_range_http"`
PortRangeRaft PortRange `json:"port_range_raft"`
LastHealthCheck time.Time `json:"last_health_check"`
IsHealthy bool `json:"is_healthy"`
}
// PortRange represents a range of available ports
type PortRange struct {
Start int `json:"start"`
End int `json:"end"`
}
// MetadataStore is an in-memory store for database metadata
type MetadataStore struct {
databases map[string]*DatabaseMetadata // key = database name
nodes map[string]*NodeCapacity // key = node ID
mu sync.RWMutex
}
// NewMetadataStore creates a new metadata store
func NewMetadataStore() *MetadataStore {
return &MetadataStore{
databases: make(map[string]*DatabaseMetadata),
nodes: make(map[string]*NodeCapacity),
}
}
// GetDatabase retrieves metadata for a database
func (ms *MetadataStore) GetDatabase(name string) *DatabaseMetadata {
ms.mu.RLock()
defer ms.mu.RUnlock()
if db, exists := ms.databases[name]; exists {
// Return a shallow copy so callers cannot swap top-level fields; nested
// maps (PortMappings, VectorClock) are still shared with the store.
dbCopy := *db
return &dbCopy
}
return nil
}
// SetDatabase stores or updates metadata for a database
func (ms *MetadataStore) SetDatabase(db *DatabaseMetadata) {
ms.mu.Lock()
defer ms.mu.Unlock()
ms.databases[db.DatabaseName] = db
}
// DeleteDatabase removes metadata for a database
func (ms *MetadataStore) DeleteDatabase(name string) {
ms.mu.Lock()
defer ms.mu.Unlock()
delete(ms.databases, name)
}
// ListDatabases returns all database names
func (ms *MetadataStore) ListDatabases() []string {
ms.mu.RLock()
defer ms.mu.RUnlock()
names := make([]string, 0, len(ms.databases))
for name := range ms.databases {
names = append(names, name)
}
return names
}
// GetNode retrieves capacity info for a node
func (ms *MetadataStore) GetNode(nodeID string) *NodeCapacity {
ms.mu.RLock()
defer ms.mu.RUnlock()
if node, exists := ms.nodes[nodeID]; exists {
nodeCopy := *node
return &nodeCopy
}
return nil
}
// SetNode stores or updates capacity info for a node
func (ms *MetadataStore) SetNode(node *NodeCapacity) {
ms.mu.Lock()
defer ms.mu.Unlock()
ms.nodes[node.NodeID] = node
}
// DeleteNode removes capacity info for a node
func (ms *MetadataStore) DeleteNode(nodeID string) {
ms.mu.Lock()
defer ms.mu.Unlock()
delete(ms.nodes, nodeID)
}
// ListNodes returns all node IDs
func (ms *MetadataStore) ListNodes() []string {
ms.mu.RLock()
defer ms.mu.RUnlock()
ids := make([]string, 0, len(ms.nodes))
for id := range ms.nodes {
ids = append(ids, id)
}
return ids
}
// GetHealthyNodes returns IDs of healthy nodes
func (ms *MetadataStore) GetHealthyNodes() []string {
ms.mu.RLock()
defer ms.mu.RUnlock()
healthy := make([]string, 0)
for id, node := range ms.nodes {
if node.IsHealthy && node.CurrentDatabases < node.MaxDatabases {
healthy = append(healthy, id)
}
}
return healthy
}
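
A short usage sketch (hypothetical names; peer IDs and counts are made up) showing the copy-on-read getters and host selection:

func exampleMetadataStore() {
    store := NewMetadataStore()
    store.SetNode(&NodeCapacity{NodeID: "peerA", MaxDatabases: 10, IsHealthy: true})
    store.SetDatabase(&DatabaseMetadata{
        DatabaseName: "my_app_exampledb_1",
        Status:       StatusInitializing,
        CreatedAt:    time.Now(),
        VectorClock:  map[string]uint64{"peerA": 1},
    })
    // Getters hand back copies: mutating the result does not touch the store.
    if db := store.GetDatabase("my_app_exampledb_1"); db != nil {
        db.Status = StatusActive // changes the local copy only
    }
    // Candidate hosts for a new cluster: healthy nodes with spare capacity.
    _ = store.GetHealthyNodes()
}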

(deleted file)
@@ -1,442 +0,0 @@
package rqlite
import (
"context"
"database/sql"
"fmt"
"io/fs"
"os"
"path/filepath"
"sort"
"strconv"
"strings"
"unicode"
_ "github.com/rqlite/gorqlite/stdlib"
"go.uber.org/zap"
)
// ApplyMigrations scans a directory for *.sql files, orders them by numeric prefix,
// and applies any that are not yet recorded in schema_migrations(version).
func ApplyMigrations(ctx context.Context, db *sql.DB, dir string, logger *zap.Logger) error {
if logger == nil {
logger = zap.NewNop()
}
if err := ensureMigrationsTable(ctx, db); err != nil {
return fmt.Errorf("ensure schema_migrations: %w", err)
}
files, err := readMigrationFiles(dir)
if err != nil {
return fmt.Errorf("read migration files: %w", err)
}
if len(files) == 0 {
logger.Info("No migrations found", zap.String("dir", dir))
return nil
}
applied, err := loadAppliedVersions(ctx, db)
if err != nil {
return fmt.Errorf("load applied versions: %w", err)
}
for _, mf := range files {
if applied[mf.Version] {
logger.Info("Migration already applied; skipping", zap.Int("version", mf.Version), zap.String("name", mf.Name))
continue
}
sqlBytes, err := os.ReadFile(mf.Path)
if err != nil {
return fmt.Errorf("read migration %s: %w", mf.Path, err)
}
logger.Info("Applying migration", zap.Int("version", mf.Version), zap.String("name", mf.Name))
if err := applySQL(ctx, db, string(sqlBytes)); err != nil {
return fmt.Errorf("apply migration %d (%s): %w", mf.Version, mf.Name, err)
}
if _, err := db.ExecContext(ctx, `INSERT OR IGNORE INTO schema_migrations(version) VALUES (?)`, mf.Version); err != nil {
return fmt.Errorf("record migration %d: %w", mf.Version, err)
}
logger.Info("Migration applied", zap.Int("version", mf.Version), zap.String("name", mf.Name))
}
return nil
}
// ApplyMigrationsDirs applies migrations from multiple directories.
// - Gathers *.sql files from each dir
// - Parses numeric prefix as the version
// - Errors if the same version appears in more than one dir (to avoid ambiguity)
// - Sorts globally by version and applies those not yet in schema_migrations
func ApplyMigrationsDirs(ctx context.Context, db *sql.DB, dirs []string, logger *zap.Logger) error {
if logger == nil {
logger = zap.NewNop()
}
if err := ensureMigrationsTable(ctx, db); err != nil {
return fmt.Errorf("ensure schema_migrations: %w", err)
}
files, err := readMigrationFilesFromDirs(dirs)
if err != nil {
return err
}
if len(files) == 0 {
logger.Info("No migrations found in provided directories", zap.Strings("dirs", dirs))
return nil
}
applied, err := loadAppliedVersions(ctx, db)
if err != nil {
return fmt.Errorf("load applied versions: %w", err)
}
for _, mf := range files {
if applied[mf.Version] {
logger.Info("Migration already applied; skipping", zap.Int("version", mf.Version), zap.String("name", mf.Name), zap.String("path", mf.Path))
continue
}
sqlBytes, err := os.ReadFile(mf.Path)
if err != nil {
return fmt.Errorf("read migration %s: %w", mf.Path, err)
}
logger.Info("Applying migration", zap.Int("version", mf.Version), zap.String("name", mf.Name), zap.String("path", mf.Path))
if err := applySQL(ctx, db, string(sqlBytes)); err != nil {
return fmt.Errorf("apply migration %d (%s): %w", mf.Version, mf.Name, err)
}
if _, err := db.ExecContext(ctx, `INSERT OR IGNORE INTO schema_migrations(version) VALUES (?)`, mf.Version); err != nil {
return fmt.Errorf("record migration %d: %w", mf.Version, err)
}
logger.Info("Migration applied", zap.Int("version", mf.Version), zap.String("name", mf.Name))
}
return nil
}
// ApplyMigrationsFromManager is a convenience helper bound to RQLiteManager.
func (r *RQLiteManager) ApplyMigrations(ctx context.Context, dir string) error {
db, err := sql.Open("rqlite", fmt.Sprintf("http://localhost:%d", r.config.RQLitePort))
if err != nil {
return fmt.Errorf("open rqlite db: %w", err)
}
defer db.Close()
return ApplyMigrations(ctx, db, dir, r.logger)
}
// ApplyMigrationsDirs is the multi-dir variant on RQLiteManager.
func (r *RQLiteManager) ApplyMigrationsDirs(ctx context.Context, dirs []string) error {
db, err := sql.Open("rqlite", fmt.Sprintf("http://localhost:%d", r.config.RQLitePort))
if err != nil {
return fmt.Errorf("open rqlite db: %w", err)
}
defer db.Close()
return ApplyMigrationsDirs(ctx, db, dirs, r.logger)
}
func ensureMigrationsTable(ctx context.Context, db *sql.DB) error {
_, err := db.ExecContext(ctx, `
CREATE TABLE IF NOT EXISTS schema_migrations (
version INTEGER PRIMARY KEY,
applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
)`)
return err
}
type migrationFile struct {
Version int
Name string
Path string
}
func readMigrationFiles(dir string) ([]migrationFile, error) {
entries, err := os.ReadDir(dir)
if err != nil {
if os.IsNotExist(err) {
return []migrationFile{}, nil
}
return nil, err
}
var out []migrationFile
for _, e := range entries {
if e.IsDir() {
continue
}
name := e.Name()
if !strings.HasSuffix(strings.ToLower(name), ".sql") {
continue
}
ver, ok := parseVersionPrefix(name)
if !ok {
continue
}
out = append(out, migrationFile{
Version: ver,
Name: name,
Path: filepath.Join(dir, name),
})
}
sort.Slice(out, func(i, j int) bool { return out[i].Version < out[j].Version })
return out, nil
}
func readMigrationFilesFromDirs(dirs []string) ([]migrationFile, error) {
all := make([]migrationFile, 0, 64)
seen := map[int]string{} // version -> path (for duplicate detection)
for _, d := range dirs {
files, err := readMigrationFiles(d)
if err != nil {
return nil, fmt.Errorf("reading dir %s: %w", d, err)
}
for _, f := range files {
if prev, dup := seen[f.Version]; dup {
return nil, fmt.Errorf("duplicate migration version %d detected in %s and %s; ensure global version uniqueness", f.Version, prev, f.Path)
}
seen[f.Version] = f.Path
all = append(all, f)
}
}
sort.Slice(all, func(i, j int) bool { return all[i].Version < all[j].Version })
return all, nil
}
func parseVersionPrefix(name string) (int, bool) {
// Expect formats like "001_initial.sql", "2_add_table.sql", etc.
i := 0
for i < len(name) && unicode.IsDigit(rune(name[i])) {
i++
}
if i == 0 {
return 0, false
}
ver, err := strconv.Atoi(name[:i])
if err != nil {
return 0, false
}
return ver, true
}
func loadAppliedVersions(ctx context.Context, db *sql.DB) (map[int]bool, error) {
rows, err := db.QueryContext(ctx, `SELECT version FROM schema_migrations`)
if err != nil {
// If the table doesn't exist yet (very first run), ensure it and return empty set.
if isNoSuchTable(err) {
if err := ensureMigrationsTable(ctx, db); err != nil {
return nil, err
}
return map[int]bool{}, nil
}
return nil, err
}
defer rows.Close()
applied := make(map[int]bool)
for rows.Next() {
var v int
if err := rows.Scan(&v); err != nil {
return nil, err
}
applied[v] = true
}
return applied, rows.Err()
}
func isNoSuchTable(err error) bool {
// rqlite/sqlite error messages vary; keep it permissive
msg := strings.ToLower(err.Error())
return strings.Contains(msg, "no such table") || strings.Contains(msg, "does not exist")
}
// applySQL splits the script into individual statements, strips explicit
// transaction control (BEGIN/COMMIT/ROLLBACK/END), and executes statements
// sequentially to avoid nested transaction issues with rqlite.
func applySQL(ctx context.Context, db *sql.DB, script string) error {
s := strings.TrimSpace(script)
if s == "" {
return nil
}
stmts := splitSQLStatements(s)
stmts = filterOutTxnControls(stmts)
for _, stmt := range stmts {
if strings.TrimSpace(stmt) == "" {
continue
}
if _, err := db.ExecContext(ctx, stmt); err != nil {
return fmt.Errorf("exec stmt failed: %w (stmt: %s)", err, snippet(stmt))
}
}
return nil
}
func containsToken(stmts []string, token string) bool {
for _, s := range stmts {
if strings.EqualFold(strings.TrimSpace(s), token) {
return true
}
}
return false
}
// removed duplicate helper
// removed duplicate helper
// isTxnControl returns true if the statement is a transaction control command.
func isTxnControl(s string) bool {
t := strings.ToUpper(strings.TrimSpace(s))
switch t {
case "BEGIN", "BEGIN TRANSACTION", "COMMIT", "END", "ROLLBACK":
return true
default:
return false
}
}
// filterOutTxnControls removes BEGIN/COMMIT/ROLLBACK/END statements.
func filterOutTxnControls(stmts []string) []string {
out := make([]string, 0, len(stmts))
for _, s := range stmts {
if isTxnControl(s) {
continue
}
out = append(out, s)
}
return out
}
func snippet(s string) string {
s = strings.TrimSpace(s)
if len(s) > 120 {
return s[:120] + "..."
}
return s
}
// splitSQLStatements splits a SQL script into statements by semicolon, ignoring semicolons
// inside single/double-quoted strings and skipping comments (-- and /* */).
func splitSQLStatements(in string) []string {
var out []string
var b strings.Builder
inLineComment := false
inBlockComment := false
inSingle := false
inDouble := false
runes := []rune(in)
for i := 0; i < len(runes); i++ {
ch := runes[i]
next := rune(0)
if i+1 < len(runes) {
next = runes[i+1]
}
// Handle end of line comment
if inLineComment {
if ch == '\n' {
inLineComment = false
// keep newline normalization but don't include comment
}
continue
}
// Handle end of block comment
if inBlockComment {
if ch == '*' && next == '/' {
inBlockComment = false
i++
}
continue
}
// Start of comments?
if !inSingle && !inDouble {
if ch == '-' && next == '-' {
inLineComment = true
i++
continue
}
if ch == '/' && next == '*' {
inBlockComment = true
i++
continue
}
}
// Quotes
if !inDouble && ch == '\'' {
// Toggle single quotes, respecting escaped '' inside.
if inSingle {
// Check for escaped '' (two single quotes)
if next == '\'' {
b.WriteRune(ch) // write one '
i++ // skip the next '
continue
}
inSingle = false
} else {
inSingle = true
}
b.WriteRune(ch)
continue
}
if !inSingle && ch == '"' {
if inDouble {
if next == '"' {
b.WriteRune(ch)
i++
continue
}
inDouble = false
} else {
inDouble = true
}
b.WriteRune(ch)
continue
}
// Statement boundary
if ch == ';' && !inSingle && !inDouble {
stmt := strings.TrimSpace(b.String())
if stmt != "" {
out = append(out, stmt)
}
b.Reset()
continue
}
b.WriteRune(ch)
}
// Final fragment
if s := strings.TrimSpace(b.String()); s != "" {
out = append(out, s)
}
return out
}
// Optional helper to load embedded migrations if you later decide to embed.
// Keep for future use; currently unused.
func readDirFS(fsys fs.FS, root string) ([]string, error) {
var files []string
err := fs.WalkDir(fsys, root, func(path string, d fs.DirEntry, err error) error {
if err != nil {
return err
}
if d.IsDir() {
return nil
}
if strings.HasSuffix(strings.ToLower(d.Name()), ".sql") {
files = append(files, path)
}
return nil
})
return files, err
}

pkg/rqlite/ports.go (new file)
@@ -0,0 +1,208 @@
package rqlite
import (
"errors"
"fmt"
"math/rand"
"net"
"sync"
)
// PortManager manages port allocation for database instances
type PortManager struct {
allocatedPorts map[int]string // port -> database name
httpRange PortRange
raftRange PortRange
mu sync.RWMutex
}
// NewPortManager creates a new port manager
func NewPortManager(httpRange, raftRange PortRange) *PortManager {
return &PortManager{
allocatedPorts: make(map[int]string),
httpRange: httpRange,
raftRange: raftRange,
}
}
// AllocatePortPair allocates a pair of ports (HTTP and Raft) for a database
func (pm *PortManager) AllocatePortPair(dbName string) (PortPair, error) {
pm.mu.Lock()
defer pm.mu.Unlock()
// Try up to 20 times to find available ports
for attempt := 0; attempt < 20; attempt++ {
httpPort := pm.randomPortInRange(pm.httpRange)
raftPort := pm.randomPortInRange(pm.raftRange)
// Check if already allocated
if _, exists := pm.allocatedPorts[httpPort]; exists {
continue
}
if _, exists := pm.allocatedPorts[raftPort]; exists {
continue
}
// Test if actually bindable
if !pm.canBind(httpPort) || !pm.canBind(raftPort) {
continue
}
// Allocate the ports
pm.allocatedPorts[httpPort] = dbName
pm.allocatedPorts[raftPort] = dbName
return PortPair{HTTPPort: httpPort, RaftPort: raftPort}, nil
}
return PortPair{}, errors.New("no available ports after 20 attempts")
}
// ReleasePortPair releases a pair of ports back to the pool
func (pm *PortManager) ReleasePortPair(pair PortPair) {
pm.mu.Lock()
defer pm.mu.Unlock()
delete(pm.allocatedPorts, pair.HTTPPort)
delete(pm.allocatedPorts, pair.RaftPort)
}
// IsPortPairAvailable checks if a specific port pair is available
func (pm *PortManager) IsPortPairAvailable(pair PortPair) bool {
pm.mu.RLock()
defer pm.mu.RUnlock()
// Check if ports are in range
if !pm.isInRange(pair.HTTPPort, pm.httpRange) {
return false
}
if !pm.isInRange(pair.RaftPort, pm.raftRange) {
return false
}
// Check if already allocated
if _, exists := pm.allocatedPorts[pair.HTTPPort]; exists {
return false
}
if _, exists := pm.allocatedPorts[pair.RaftPort]; exists {
return false
}
// Test if actually bindable
return pm.canBind(pair.HTTPPort) && pm.canBind(pair.RaftPort)
}
// AllocateSpecificPortPair attempts to allocate a specific port pair
func (pm *PortManager) AllocateSpecificPortPair(dbName string, pair PortPair) error {
pm.mu.Lock()
defer pm.mu.Unlock()
// Check if ports are in range
if !pm.isInRange(pair.HTTPPort, pm.httpRange) {
return fmt.Errorf("HTTP port %d not in range %d-%d", pair.HTTPPort, pm.httpRange.Start, pm.httpRange.End)
}
if !pm.isInRange(pair.RaftPort, pm.raftRange) {
return fmt.Errorf("Raft port %d not in range %d-%d", pair.RaftPort, pm.raftRange.Start, pm.raftRange.End)
}
// Check if already allocated
if _, exists := pm.allocatedPorts[pair.HTTPPort]; exists {
return fmt.Errorf("HTTP port %d already allocated", pair.HTTPPort)
}
if _, exists := pm.allocatedPorts[pair.RaftPort]; exists {
return fmt.Errorf("Raft port %d already allocated", pair.RaftPort)
}
// Test if actually bindable
if !pm.canBind(pair.HTTPPort) {
return fmt.Errorf("HTTP port %d not bindable", pair.HTTPPort)
}
if !pm.canBind(pair.RaftPort) {
return fmt.Errorf("Raft port %d not bindable", pair.RaftPort)
}
// Allocate the ports
pm.allocatedPorts[pair.HTTPPort] = dbName
pm.allocatedPorts[pair.RaftPort] = dbName
return nil
}
// GetAllocatedPorts returns all currently allocated ports
func (pm *PortManager) GetAllocatedPorts() map[int]string {
pm.mu.RLock()
defer pm.mu.RUnlock()
// Return a copy (named out to avoid shadowing the builtin copy)
out := make(map[int]string, len(pm.allocatedPorts))
for port, db := range pm.allocatedPorts {
out[port] = db
}
return out
}
// GetAvailablePortCount returns the approximate number of available ports
func (pm *PortManager) GetAvailablePortCount() int {
pm.mu.RLock()
defer pm.mu.RUnlock()
httpCount := pm.httpRange.End - pm.httpRange.Start + 1
raftCount := pm.raftRange.End - pm.raftRange.Start + 1
// Return the minimum of the two (since we need pairs)
totalPairs := httpCount
if raftCount < httpCount {
totalPairs = raftCount
}
return totalPairs - len(pm.allocatedPorts)/2
}
// IsPortAllocated checks if a port is currently allocated
func (pm *PortManager) IsPortAllocated(port int) bool {
pm.mu.RLock()
defer pm.mu.RUnlock()
_, exists := pm.allocatedPorts[port]
return exists
}
// AllocateSpecificPorts records specific ports for a database. Unlike
// AllocateSpecificPortPair, it performs no range or bind checks.
func (pm *PortManager) AllocateSpecificPorts(dbName string, ports PortPair) error {
pm.mu.Lock()
defer pm.mu.Unlock()
// Check if ports are already allocated
if _, exists := pm.allocatedPorts[ports.HTTPPort]; exists {
return fmt.Errorf("HTTP port %d already allocated", ports.HTTPPort)
}
if _, exists := pm.allocatedPorts[ports.RaftPort]; exists {
return fmt.Errorf("Raft port %d already allocated", ports.RaftPort)
}
// Allocate the ports
pm.allocatedPorts[ports.HTTPPort] = dbName
pm.allocatedPorts[ports.RaftPort] = dbName
return nil
}
// randomPortInRange returns a random port within the given range
func (pm *PortManager) randomPortInRange(portRange PortRange) int {
return portRange.Start + rand.Intn(portRange.End-portRange.Start+1)
}
// isInRange checks if a port is within the given range
func (pm *PortManager) isInRange(port int, portRange PortRange) bool {
return port >= portRange.Start && port <= portRange.End
}
// canBind tests if a port can be bound
func (pm *PortManager) canBind(port int) bool {
// Test bind to check if port is actually available
listener, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
if err != nil {
return false
}
listener.Close()
return true
}
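
A hedged sketch of the allocate/release flow around creation and hibernation (the port ranges and function name are placeholders, not configured defaults):

func examplePortFlow() error {
    pm := NewPortManager(
        PortRange{Start: 5101, End: 5200}, // HTTP range (illustrative)
        PortRange{Start: 7101, End: 7200}, // Raft range (illustrative)
    )
    pair, err := pm.AllocatePortPair("my_app_exampledb_1")
    if err != nil {
        return err // every probed pair was taken or unbindable
    }
    // ... start the rqlite instance on pair.HTTPPort / pair.RaftPort ...
    // When the database hibernates (or startup fails), free the pair:
    pm.ReleasePortPair(pair)
    return nil
}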

(new file)
@@ -0,0 +1,204 @@
package rqlite
import (
"encoding/json"
"time"
)
// MessageType represents the type of metadata message
type MessageType string
const (
// Database lifecycle
MsgDatabaseCreateRequest MessageType = "DATABASE_CREATE_REQUEST"
MsgDatabaseCreateResponse MessageType = "DATABASE_CREATE_RESPONSE"
MsgDatabaseCreateConfirm MessageType = "DATABASE_CREATE_CONFIRM"
MsgDatabaseStatusUpdate MessageType = "DATABASE_STATUS_UPDATE"
MsgDatabaseDelete MessageType = "DATABASE_DELETE"
// Hibernation
MsgDatabaseIdleNotification MessageType = "DATABASE_IDLE_NOTIFICATION"
MsgDatabaseShutdownCoordinated MessageType = "DATABASE_SHUTDOWN_COORDINATED"
MsgDatabaseWakeupRequest MessageType = "DATABASE_WAKEUP_REQUEST"
// Node management
MsgNodeCapacityAnnouncement MessageType = "NODE_CAPACITY_ANNOUNCEMENT"
MsgNodeHealthPing MessageType = "NODE_HEALTH_PING"
MsgNodeHealthPong MessageType = "NODE_HEALTH_PONG"
// Failure handling
MsgNodeReplacementNeeded MessageType = "NODE_REPLACEMENT_NEEDED"
MsgNodeReplacementOffer MessageType = "NODE_REPLACEMENT_OFFER"
MsgNodeReplacementConfirm MessageType = "NODE_REPLACEMENT_CONFIRM"
MsgDatabaseCleanup MessageType = "DATABASE_CLEANUP"
// Gossip
MsgMetadataSync MessageType = "METADATA_SYNC"
MsgMetadataChecksumReq MessageType = "METADATA_CHECKSUM_REQUEST"
MsgMetadataChecksumRes MessageType = "METADATA_CHECKSUM_RESPONSE"
)
// MetadataMessage is the envelope for all metadata messages
type MetadataMessage struct {
Type MessageType `json:"type"`
Timestamp time.Time `json:"timestamp"`
NodeID string `json:"node_id"` // Sender
Payload json.RawMessage `json:"payload"`
}
// DatabaseCreateRequest is sent when a client wants to create a new database
type DatabaseCreateRequest struct {
DatabaseName string `json:"database_name"`
RequesterNodeID string `json:"requester_node_id"`
ReplicationFactor int `json:"replication_factor"`
}
// DatabaseCreateResponse is sent by eligible nodes offering to host the database
type DatabaseCreateResponse struct {
DatabaseName string `json:"database_name"`
NodeID string `json:"node_id"`
AvailablePorts PortPair `json:"available_ports"`
}
// DatabaseCreateConfirm is sent by the coordinator with the final membership
type DatabaseCreateConfirm struct {
DatabaseName string `json:"database_name"`
SelectedNodes []NodeAssignment `json:"selected_nodes"`
CoordinatorNodeID string `json:"coordinator_node_id"`
}
// NodeAssignment represents a node assignment in a database cluster
type NodeAssignment struct {
NodeID string `json:"node_id"`
HTTPPort int `json:"http_port"`
RaftPort int `json:"raft_port"`
Role string `json:"role"` // "leader" or "follower"
}
// DatabaseStatusUpdate is sent when a database changes status
type DatabaseStatusUpdate struct {
DatabaseName string `json:"database_name"`
NodeID string `json:"node_id"`
Status DatabaseStatus `json:"status"`
HTTPPort int `json:"http_port,omitempty"`
RaftPort int `json:"raft_port,omitempty"`
}
// DatabaseIdleNotification is sent when a node detects idle database
type DatabaseIdleNotification struct {
DatabaseName string `json:"database_name"`
NodeID string `json:"node_id"`
LastActivity time.Time `json:"last_activity"`
}
// DatabaseShutdownCoordinated is sent to coordinate hibernation shutdown
type DatabaseShutdownCoordinated struct {
DatabaseName string `json:"database_name"`
ShutdownTime time.Time `json:"shutdown_time"` // When to actually shutdown
}
// DatabaseWakeupRequest is sent to wake up a hibernating database
type DatabaseWakeupRequest struct {
DatabaseName string `json:"database_name"`
RequesterNodeID string `json:"requester_node_id"`
}
// NodeCapacityAnnouncement is sent periodically to announce node capacity
type NodeCapacityAnnouncement struct {
NodeID string `json:"node_id"`
MaxDatabases int `json:"max_databases"`
CurrentDatabases int `json:"current_databases"`
PortRangeHTTP PortRange `json:"port_range_http"`
PortRangeRaft PortRange `json:"port_range_raft"`
}
// NodeHealthPing is sent periodically for health checks
type NodeHealthPing struct {
NodeID string `json:"node_id"`
CurrentDatabases int `json:"current_databases"`
}
// NodeHealthPong is the response to a health ping
type NodeHealthPong struct {
NodeID string `json:"node_id"`
Healthy bool `json:"healthy"`
PingFrom string `json:"ping_from"`
}
// NodeReplacementNeeded is sent when a node failure is detected
type NodeReplacementNeeded struct {
DatabaseName string `json:"database_name"`
FailedNodeID string `json:"failed_node_id"`
CurrentNodes []string `json:"current_nodes"`
ReplicationFactor int `json:"replication_factor"`
}
// NodeReplacementOffer is sent by nodes offering to replace a failed node
type NodeReplacementOffer struct {
DatabaseName string `json:"database_name"`
NodeID string `json:"node_id"`
AvailablePorts PortPair `json:"available_ports"`
}
// NodeReplacementConfirm is sent when a replacement node is selected
type NodeReplacementConfirm struct {
DatabaseName string `json:"database_name"`
NewNodeID string `json:"new_node_id"`
ReplacedNodeID string `json:"replaced_node_id"`
NewNodePorts PortPair `json:"new_node_ports"`
JoinAddress string `json:"join_address"`
}
// DatabaseCleanup is sent to trigger cleanup of orphaned data
type DatabaseCleanup struct {
DatabaseName string `json:"database_name"`
NodeID string `json:"node_id"`
Action string `json:"action"` // e.g., "deleted_orphaned_data"
}
// MetadataSync contains full database metadata for synchronization
type MetadataSync struct {
Metadata *DatabaseMetadata `json:"metadata"`
}
// MetadataChecksumRequest requests checksums from other nodes
type MetadataChecksumRequest struct {
RequestID string `json:"request_id"`
}
// MetadataChecksumResponse contains checksums for all databases.
// (MetadataChecksum is not defined in this file; it is expected to live
// with the gossip/consensus helpers.)
type MetadataChecksumResponse struct {
RequestID string `json:"request_id"`
Checksums []MetadataChecksum `json:"checksums"`
}
// MarshalMetadataMessage creates a MetadataMessage with the given payload
func MarshalMetadataMessage(msgType MessageType, nodeID string, payload interface{}) ([]byte, error) {
payloadBytes, err := json.Marshal(payload)
if err != nil {
return nil, err
}
msg := MetadataMessage{
Type: msgType,
Timestamp: time.Now(),
NodeID: nodeID,
Payload: payloadBytes,
}
return json.Marshal(msg)
}
// UnmarshalMetadataMessage parses a MetadataMessage
func UnmarshalMetadataMessage(data []byte) (*MetadataMessage, error) {
var msg MetadataMessage
if err := json.Unmarshal(data, &msg); err != nil {
return nil, err
}
return &msg, nil
}
// UnmarshalPayload unmarshals the payload into the given type
func (msg *MetadataMessage) UnmarshalPayload(v interface{}) error {
return json.Unmarshal(msg.Payload, v)
}
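
To show how the envelope is intended to round-trip over pubsub, a sketch assuming a hypothetical handler (peer ID and payload values are made up):

func exampleCreateRequestRoundTrip() error {
    // Sender: wrap a typed payload in the MetadataMessage envelope.
    data, err := MarshalMetadataMessage(MsgDatabaseCreateRequest, "peerA", DatabaseCreateRequest{
        DatabaseName:      "my_app_exampledb_1",
        RequesterNodeID:   "peerA",
        ReplicationFactor: 3,
    })
    if err != nil {
        return err
    }
    // Receiver: decode the envelope, then dispatch on Type.
    msg, err := UnmarshalMetadataMessage(data)
    if err != nil {
        return err
    }
    if msg.Type == MsgDatabaseCreateRequest {
        var req DatabaseCreateRequest
        if err := msg.UnmarshalPayload(&req); err != nil {
            return err
        }
        // ... decide whether this node can offer to host req.DatabaseName ...
    }
    return nil
}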

(deleted file)
@@ -1,362 +0,0 @@
package rqlite
import (
"context"
"errors"
"fmt"
"net/http"
"os"
"os/exec"
"path/filepath"
"strings"
"syscall"
"time"
"github.com/rqlite/gorqlite"
"go.uber.org/zap"
"github.com/DeBrosOfficial/network/pkg/config"
)
// RQLiteManager manages an RQLite node instance
type RQLiteManager struct {
config *config.DatabaseConfig
discoverConfig *config.DiscoveryConfig
dataDir string
logger *zap.Logger
cmd *exec.Cmd
connection *gorqlite.Connection
}
// waitForSQLAvailable waits until a simple query succeeds, indicating a leader is known and queries can be served.
func (r *RQLiteManager) waitForSQLAvailable(ctx context.Context) error {
if r.connection == nil {
r.logger.Error("No rqlite connection")
return errors.New("no rqlite connection")
}
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()
attempts := 0
for {
select {
case <-ctx.Done():
return ctx.Err()
case <-ticker.C:
attempts++
_, err := r.connection.QueryOne("SELECT 1")
if err == nil {
r.logger.Info("RQLite SQL is available")
return nil
}
if attempts%5 == 0 { // log every ~5s to reduce noise
r.logger.Debug("Waiting for RQLite SQL availability", zap.Error(err))
}
}
}
}
// NewRQLiteManager creates a new RQLite manager
func NewRQLiteManager(cfg *config.DatabaseConfig, discoveryCfg *config.DiscoveryConfig, dataDir string, logger *zap.Logger) *RQLiteManager {
return &RQLiteManager{
config: cfg,
discoverConfig: discoveryCfg,
dataDir: dataDir,
logger: logger,
}
}
// Start starts the RQLite node
func (r *RQLiteManager) Start(ctx context.Context) error {
// Create data directory
rqliteDataDir := filepath.Join(r.dataDir, "rqlite")
if err := os.MkdirAll(rqliteDataDir, 0755); err != nil {
return fmt.Errorf("failed to create RQLite data directory: %w", err)
}
if r.discoverConfig.HttpAdvAddress == "" {
return fmt.Errorf("discovery config HttpAdvAddress is empty")
}
// Build RQLite command
args := []string{
"-http-addr", fmt.Sprintf("0.0.0.0:%d", r.config.RQLitePort),
"-http-adv-addr", r.discoverConfig.HttpAdvAddress,
"-raft-adv-addr", r.discoverConfig.RaftAdvAddress,
"-raft-addr", fmt.Sprintf("0.0.0.0:%d", r.config.RQLiteRaftPort),
}
// Add join address if specified (for non-bootstrap or secondary bootstrap nodes)
if r.config.RQLiteJoinAddress != "" {
r.logger.Info("Joining RQLite cluster", zap.String("join_address", r.config.RQLiteJoinAddress))
// Normalize join address to host:port for rqlited -join
joinArg := r.config.RQLiteJoinAddress
if strings.HasPrefix(joinArg, "http://") {
joinArg = strings.TrimPrefix(joinArg, "http://")
} else if strings.HasPrefix(joinArg, "https://") {
joinArg = strings.TrimPrefix(joinArg, "https://")
}
// Wait for join target to become reachable to avoid forming a separate cluster (wait indefinitely)
if err := r.waitForJoinTarget(ctx, joinArg, 0); err != nil {
r.logger.Warn("Join target did not become reachable within timeout; will still attempt to join",
zap.String("join_address", r.config.RQLiteJoinAddress),
zap.Error(err))
}
// Always add the join parameter in host:port form - let rqlited handle the rest
args = append(args, "-join", joinArg)
} else {
r.logger.Info("No join address specified - starting as new cluster")
}
// Add data directory as positional argument
args = append(args, rqliteDataDir)
r.logger.Info("Starting RQLite node",
zap.String("data_dir", rqliteDataDir),
zap.Int("http_port", r.config.RQLitePort),
zap.Int("raft_port", r.config.RQLiteRaftPort),
zap.String("join_address", r.config.RQLiteJoinAddress),
zap.Strings("full_args", args),
)
// Start RQLite process (not bound to ctx for graceful Stop handling)
r.cmd = exec.Command("rqlited", args...)
// Uncomment if you want to see the stdout/stderr of the RQLite process
// r.cmd.Stdout = os.Stdout
// r.cmd.Stderr = os.Stderr
if err := r.cmd.Start(); err != nil {
return fmt.Errorf("failed to start RQLite: %w", err)
}
// Wait for RQLite to be ready
if err := r.waitForReady(ctx); err != nil {
if r.cmd != nil && r.cmd.Process != nil {
_ = r.cmd.Process.Kill()
}
return fmt.Errorf("RQLite failed to become ready: %w", err)
}
// Create connection
conn, err := gorqlite.Open(fmt.Sprintf("http://localhost:%d", r.config.RQLitePort))
if err != nil {
if r.cmd != nil && r.cmd.Process != nil {
_ = r.cmd.Process.Kill()
}
return fmt.Errorf("failed to connect to RQLite: %w", err)
}
r.connection = conn
// Leadership/SQL readiness gating
//
// Fresh bootstrap (no join, no prior state): wait for leadership so queries will work.
// Existing state or joiners: wait for SQL availability (leader known) before proceeding,
// so higher layers (storage) don't fail with 500 leader-not-found.
if r.config.RQLiteJoinAddress == "" && !r.hasExistingState(rqliteDataDir) {
if err := r.waitForLeadership(ctx); err != nil {
if r.cmd != nil && r.cmd.Process != nil {
_ = r.cmd.Process.Kill()
}
return fmt.Errorf("RQLite failed to establish leadership: %w", err)
}
} else {
r.logger.Info("Waiting for RQLite SQL availability (leader discovery)")
if err := r.waitForSQLAvailable(ctx); err != nil {
if r.cmd != nil && r.cmd.Process != nil {
_ = r.cmd.Process.Kill()
}
return fmt.Errorf("RQLite SQL not available: %w", err)
}
}
// After waitForLeadership / waitForSQLAvailable succeeds, before returning:
migrationsDir := "migrations"
if err := r.ApplyMigrations(ctx, migrationsDir); err != nil {
r.logger.Error("Migrations failed", zap.Error(err), zap.String("dir", migrationsDir))
return fmt.Errorf("apply migrations: %w", err)
}
r.logger.Info("RQLite node started successfully")
return nil
}
// hasExistingState returns true if the rqlite data directory already contains files or subdirectories.
func (r *RQLiteManager) hasExistingState(rqliteDataDir string) bool {
entries, err := os.ReadDir(rqliteDataDir)
if err != nil {
return false
}
for _, e := range entries {
// Any existing file or directory indicates prior state
if e.Name() == "." || e.Name() == ".." {
continue
}
return true
}
return false
}
// waitForReady waits for RQLite to be ready to accept connections
func (r *RQLiteManager) waitForReady(ctx context.Context) error {
url := fmt.Sprintf("http://localhost:%d/status", r.config.RQLitePort)
client := &http.Client{Timeout: 2 * time.Second}
for i := 0; i < 30; i++ {
select {
case <-ctx.Done():
return ctx.Err()
default:
}
resp, err := client.Get(url)
if err == nil {
resp.Body.Close()
if resp.StatusCode == http.StatusOK {
return nil
}
}
time.Sleep(1 * time.Second)
}
return fmt.Errorf("RQLite did not become ready within timeout")
}
// waitForLeadership waits for RQLite to establish leadership (for bootstrap nodes)
func (r *RQLiteManager) waitForLeadership(ctx context.Context) error {
r.logger.Info("Waiting for RQLite to establish leadership...")
for i := 0; i < 30; i++ {
select {
case <-ctx.Done():
return ctx.Err()
default:
}
// Try a simple query to check if leadership is established
if r.connection != nil {
_, err := r.connection.QueryOne("SELECT 1")
if err == nil {
r.logger.Info("RQLite leadership established")
return nil
}
r.logger.Debug("Waiting for leadership", zap.Error(err))
}
time.Sleep(1 * time.Second)
}
return fmt.Errorf("RQLite failed to establish leadership within timeout")
}
// GetConnection returns the RQLite connection
func (r *RQLiteManager) GetConnection() *gorqlite.Connection {
return r.connection
}
// Stop stops the RQLite node
func (r *RQLiteManager) Stop() error {
if r.connection != nil {
r.connection.Close()
r.connection = nil
}
if r.cmd == nil || r.cmd.Process == nil {
return nil
}
r.logger.Info("Stopping RQLite node (graceful)")
// Try SIGTERM first
if err := r.cmd.Process.Signal(syscall.SIGTERM); err != nil {
// Fallback to Kill if signaling fails
_ = r.cmd.Process.Kill()
return nil
}
// Wait up to 5 seconds for graceful shutdown
done := make(chan error, 1)
go func() { done <- r.cmd.Wait() }()
select {
case err := <-done:
if err != nil && !errors.Is(err, os.ErrClosed) {
r.logger.Warn("RQLite process exited with error", zap.Error(err))
}
case <-time.After(5 * time.Second):
r.logger.Warn("RQLite did not exit in time; killing")
_ = r.cmd.Process.Kill()
}
return nil
}
// waitForJoinTarget waits until the join target's HTTP status becomes reachable, or until timeout
func (r *RQLiteManager) waitForJoinTarget(ctx context.Context, joinAddress string, timeout time.Duration) error {
var deadline time.Time
if timeout > 0 {
deadline = time.Now().Add(timeout)
}
var lastErr error
for {
if err := r.testJoinAddress(joinAddress); err == nil {
r.logger.Info("Join target is reachable, proceeding with cluster join")
return nil
} else {
lastErr = err
r.logger.Debug("Join target not yet reachable; waiting...", zap.String("join_address", joinAddress), zap.Error(err))
}
// Check context
select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(2 * time.Second):
}
if !deadline.IsZero() && time.Now().After(deadline) {
break
}
}
return lastErr
}
// testJoinAddress tests if a join address is reachable
func (r *RQLiteManager) testJoinAddress(joinAddress string) error {
// Determine the HTTP status URL to probe.
// If joinAddress contains a scheme, use it directly. Otherwise treat joinAddress
// as host:port (Raft) and probe the standard HTTP API port 5001 on that host.
client := &http.Client{Timeout: 5 * time.Second}
var statusURL string
if strings.HasPrefix(joinAddress, "http://") || strings.HasPrefix(joinAddress, "https://") {
statusURL = strings.TrimRight(joinAddress, "/") + "/status"
} else {
// Extract host from host:port
host := joinAddress
if idx := strings.Index(joinAddress, ":"); idx != -1 {
host = joinAddress[:idx]
}
statusURL = fmt.Sprintf("http://%s:%d/status", host, 5001)
}
r.logger.Debug("Testing join target via HTTP", zap.String("url", statusURL))
resp, err := client.Get(statusURL)
if err != nil {
return fmt.Errorf("failed to connect to leader HTTP at %s: %w", statusURL, err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
return fmt.Errorf("leader HTTP at %s returned status %d", statusURL, resp.StatusCode)
}
r.logger.Info("Leader HTTP reachable", zap.String("status_url", statusURL))
return nil
}

(new file)
@@ -0,0 +1,81 @@
package rqlite
// VectorClock represents a vector clock for distributed consistency
type VectorClock map[string]uint64
// NewVectorClock creates a new vector clock
func NewVectorClock() VectorClock {
return make(VectorClock)
}
// Increment increments the clock for a given node
func (vc VectorClock) Increment(nodeID string) {
vc[nodeID]++
}
// Update merges another clock into vc, taking the element-wise maximum
func (vc VectorClock) Update(other VectorClock) {
for nodeID, value := range other {
if existing, exists := vc[nodeID]; !exists || value > existing {
vc[nodeID] = value
}
}
}
// Copy creates a copy of the vector clock
func (vc VectorClock) Copy() VectorClock {
out := make(VectorClock, len(vc)) // avoid shadowing the builtin copy
for k, v := range vc {
out[k] = v
}
return out
}
// Compare compares two vector clocks.
// Returns: -1 if vc happens before other, 1 if vc happens after other,
// and 0 if the clocks are equal or concurrent.
func (vc VectorClock) Compare(other VectorClock) int {
less := false
greater := false
// Check all keys in both clocks
allKeys := make(map[string]bool)
for k := range vc {
allKeys[k] = true
}
for k := range other {
allKeys[k] = true
}
for k := range allKeys {
v1 := vc[k]
v2 := other[k]
if v1 < v2 {
less = true
} else if v1 > v2 {
greater = true
}
}
if less && !greater {
return -1 // vc < other
} else if greater && !less {
return 1 // vc > other
}
return 0 // equal or concurrent
}
// HappensBefore checks if this clock happens before another
func (vc VectorClock) HappensBefore(other VectorClock) bool {
return vc.Compare(other) == -1
}
// HappensAfter checks if this clock happens after another
func (vc VectorClock) HappensAfter(other VectorClock) bool {
return vc.Compare(other) == 1
}
// IsConcurrent checks whether neither clock happens before the other (equal clocks also report true)
func (vc VectorClock) IsConcurrent(other VectorClock) bool {
return vc.Compare(other) == 0
}
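
A small worked example of the ordering semantics (peer IDs invented for illustration):

func exampleVectorClockOrdering() {
    a := NewVectorClock()
    b := NewVectorClock()
    a.Increment("peerA") // a = {peerA:1}
    b.Update(a)          // b = {peerA:1}
    b.Increment("peerB") // b = {peerA:1, peerB:1}
    _ = a.HappensBefore(b) // true: b dominates a
    a.Increment("peerA")   // a = {peerA:2}; now neither dominates
    _ = a.IsConcurrent(b)  // true: a conflict the merge rules must resolve
}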