- Test join address reachability before attempting to join cluster
- Fall back to starting new cluster if join address is unreachable
- Add comprehensive logging for join address testing
- Prevent RQLite fatal errors when bootstrap node is down
This fixes the issue where secondary nodes fail with 'invalid join address'
when the primary bootstrap node is not accessible on port 4001.
- Replace hardcoded fallback IP with localhost for better compatibility
- Add join address format validation
- Improve logging for better troubleshooting
- Add detailed RQLite startup logging with full args
- Add automatic external IP detection for RQLite advertised addresses
- Use 0.0.0.0 for binding but actual IP for advertising to other nodes
- Add -http-adv-addr and -raft-adv-addr parameters to RQLite startup
- Resolves 'advertised HTTP address is not routable' error
- Enables proper RQLite cluster formation between nodes
- Change RQLite HTTP bind from localhost to 0.0.0.0
- Change RQLite Raft bind from localhost to 0.0.0.0
- This allows secondary bootstrap nodes and regular nodes to join the cluster
- Resolves 'invalid join address' error for secondary bootstrap nodes
CRITICAL FIX: Separate RQLite and LibP2P ports to prevent service startup failures
Changes:
- LibP2P now uses port 4000 (was conflicting with RQLite on 4001)
- RQLite continues to use port 4001 for HTTP API
- RQLite Raft uses port 4002
- Updated bootstrap peer configurations to use port 4000
- Updated install script port configurations
- Fixed firewall configuration to allow correct ports
This resolves the 'bind: address already in use' error that was preventing
the debros-node service from starting properly.
- Primary bootstrap (57.129.81.31): starts new cluster (no join address)
- Secondary bootstrap (38.242.250.186): joins primary bootstrap cluster
- Regular nodes: join primary bootstrap cluster
This allows both VPS servers to be bootstrap nodes while forming a
proper RQLite cluster where the secondary bootstrap joins the primary
instead of trying to start its own independent cluster.
Should resolve the leadership establishment timeout on the second VPS.
The actual running bootstrap node has peer ID:
12D3KooWJvJj94TmNwG1sntDWgAXi7MN3xxLLkoQzgHX6gQ22eKi
But the constants file had the wrong peer ID:
12D3KooWQRK2duw5B5LXi8gA7HBBFiCsLvwyph2ZU9VBmvbE1Nei
This mismatch was causing nodes to fail to connect to the bootstrap
node, leading to the 'invalid join address' error from RQLite.