Use these recovery procedures when a node is stuck, corrupted, out of disk space, far behind the chain tip, or unable to reconnect to peers.
Recovery paths
| Situation | First response |
|---|---|
| Node is far behind | Check peers and consider state sync. |
| State sync failed | Reset and retry with a fresh trusted height and hash. |
| Disk is full | Stop the service, increase disk capacity, and restart after confirming data integrity. |
| Config is invalid | Restore the last known-good config and validate network-specific values. |
| Validator risk | Stop and follow validator-specific key and slashing safety procedures. |
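The "fresh trusted height and hash" in the table can be gathered from an RPC endpoint you trust. The following is a minimal sketch rather than a definitive procedure. It assumes a CometBFT-style RPC that exposes /status and /block, and that curl and jq are installed. The function names, the 2000-block offset, and the RPC URL in the usage line are hypothetical choices for illustration.

```shell
# trust_height_for LATEST OFFSET
# Prints LATEST - OFFSET. Fails if either value is not a plain
# non-negative integer, or if LATEST is not greater than OFFSET.
trust_height_for() (
  latest="$1"
  offset="$2"
  for n in "$latest" "$offset"; do
    case "$n" in
      ''|*[!0-9]*) exit 2 ;;
    esac
  done
  [ "$latest" -gt "$offset" ] || exit 1
  echo $((latest - offset))
)

# fetch_trust_params RPC_URL [OFFSET]
# Prints "HEIGHT HASH" for a block OFFSET blocks behind the latest
# height reported by RPC_URL (an endpoint you must already trust).
fetch_trust_params() (
  rpc="$1"
  offset="${2:-2000}"   # illustrative default, not a recommended value
  latest=$(curl -sf "$rpc/status" | jq -r '.result.sync_info.latest_block_height')
  height=$(trust_height_for "$latest" "$offset") || exit 1
  hash=$(curl -sf "$rpc/block?height=$height" | jq -r '.result.block_id.hash')
  # An empty or "null" hash means the request or the JSON lookup failed.
  [ -n "$hash" ] && [ "$hash" != "null" ] || exit 1
  echo "$height $hash"
)
```

A call such as `fetch_trust_params https://rpc.example.invalid 2000` would print one line with the height and hash. Compare the result against a second independent source before writing the values into the node's state sync settings.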
Reset and retry
For a non-validator node that can safely discard local block data, stop the process and reset the local chain database before retrying state sync or normal block sync. This targeted reset keeps the managed install and node home. Pass the node home of the machine you are repairing with --home; the default user-level location is ~/.ctmd:
ctmd comet unsafe-reset-all --home ~/.ctmd
Linux service installs usually use /var/lib/ctmd instead of ~/.ctmd:
sudo systemctl stop ctmd.service
sudo -u ctmd ctmd comet unsafe-reset-all --home /var/lib/ctmd
sudo systemctl start ctmd.service
Do not run reset commands on validator infrastructure without confirming key safety, double-signing risk, and whether state can be safely discarded.
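As one way to act on the key-safety warning above, the signing files can be copied aside before any reset. This is a sketch under assumptions: it presumes the node home follows the default CometBFT layout (config/priv_validator_key.json and data/priv_validator_state.json), which may not hold if those paths were changed in the node's config. The function name and backup directory are hypothetical.

```shell
# backup_signing_files NODE_HOME BACKUP_DIR
# Copies the validator key and signing-state files, if present, into
# BACKUP_DIR and prints how many files were copied.
# Assumes the default CometBFT file locations under NODE_HOME.
backup_signing_files() (
  node_home="$1"
  backup_dir="$2"
  if [ ! -d "$node_home" ]; then
    echo "no such node home: $node_home" >&2
    exit 1
  fi
  mkdir -p "$backup_dir" || exit 1
  chmod 700 "$backup_dir" || exit 1   # key material: owner-only access
  copied=0
  for rel in config/priv_validator_key.json data/priv_validator_state.json; do
    if [ -f "$node_home/$rel" ]; then
      # -p preserves the original mode and timestamps
      cp -p "$node_home/$rel" "$backup_dir/$(basename "$rel")" || exit 1
      copied=$((copied + 1))
    fi
  done
  echo "$copied"
)
```

For example, `backup_signing_files ~/.ctmd ~/ctmd-signing-backup` prints the number of files copied (0 on a node with no validator files). A backup does not by itself make a reset safe on a validator; the signing state still has to be handled according to your slashing safety procedures.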
deploy.sh reset is more destructive than unsafe-reset-all: it removes the managed install, node home, service/cache, and Linux system user/group so a clean install can start over. Use it only when you intentionally want to discard the managed node home and local key material.
After recovery
After any reset or disk repair, confirm the node is advancing and connected:
curl -s http://127.0.0.1:26657/status | jq '.result.sync_info'
./deploy.sh status
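A single status call shows the current height but not whether it is moving. The sketch below samples the height twice and compares the results. It assumes curl and jq are installed and that the RPC listens on the address used above; the function names and the 10-second default wait are hypothetical.

```shell
# height_advanced FIRST SECOND
# Succeeds only if both values are plain non-negative integers and
# SECOND is strictly greater than FIRST.
height_advanced() (
  for h in "$1" "$2"; do
    case "$h" in
      ''|*[!0-9]*) exit 2 ;;
    esac
  done
  [ "$2" -gt "$1" ]
)

# latest_height RPC_URL
# Prints the latest block height reported by the node's /status endpoint.
latest_height() {
  curl -sf "$1/status" | jq -r '.result.sync_info.latest_block_height'
}

# check_advancing [RPC_URL] [WAIT_SECONDS]
# Samples the height twice, WAIT_SECONDS apart, and succeeds if it grew.
check_advancing() (
  rpc="${1:-http://127.0.0.1:26657}"
  wait_s="${2:-10}"
  first=$(latest_height "$rpc")
  sleep "$wait_s"
  second=$(latest_height "$rpc")
  height_advanced "$first" "$second"
)
```

For example, `check_advancing http://127.0.0.1:26657 15 && echo "node is advancing"`. A non-zero exit means either the height did not grow within the wait or the RPC could not be read, so follow up with the status commands above to tell the two apart.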
If the node remains far behind, use State Sync for non-archive nodes or check Full Node Block Delivery for peer connectivity and block propagation issues.