Uptime Monitoring

Tools and strategies for monitoring node uptime and maintaining high availability.

Uptime Monitoring for Flux Nodes

Maintaining high uptime is critical for Flux node operation. Uptime directly impacts your reward frequency, PNR (Progressive Node Rewards) eligibility, and application hosting reliability. A node that goes offline — even briefly — can lose its position in the reward queue, suffer PNR penalties, and cause application disruptions. Proactive monitoring ensures you detect and resolve issues before they impact your node status.

Why Uptime Matters

• Reward Queue Position: Flux uses a deterministic reward queue. When a node goes offline, it loses its queue position and must re-enter at the back, delaying the next reward payment.
• PNR Score: Progressive Node Rewards require 97%+ uptime for eligibility. Extended downtime resets or penalizes your PNR multiplier, significantly reducing rewards.
• Application Hosting: Nodes host decentralized applications. Downtime causes app instances to be migrated, and nodes with poor uptime records are deprioritized for new deployments.
• Network Reputation: Consistent uptime builds a reliable track record that benefits your node ranking on the network.
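
To make the uptime targets concrete, it helps to convert a percentage into a downtime budget. The sketch below does that arithmetic. The function name and the 30-day window are assumptions for illustration only; this guide does not state the period over which uptime is measured.

```bash
#!/bin/bash
# Downtime budget: minutes of downtime allowed at a given uptime target.
# The window length is an assumption, not a documented Flux parameter.

downtime_budget_minutes() {
  local target_pct="$1"    # uptime target in percent, e.g. 97
  local window_days="$2"   # measurement window in days (assumed)
  awk -v t="$target_pct" -v d="$window_days" \
    'BEGIN { printf "%.0f\n", d * 24 * 60 * (100 - t) / 100 }'
}

downtime_budget_minutes 97 30     # prints 1296 (about 21.6 hours)
downtime_budget_minutes 99.5 30   # prints 216 (about 3.6 hours)
```

Under that assumed window, the gap between the 97% minimum and the 99.5% ideal is the difference between roughly a day of slack and a few hours.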

Built-in FluxOS Monitoring

FluxOS includes a built-in status dashboard accessible via your web browser at http://YOUR_IP:16126. This dashboard displays real-time node status, current benchmark results, connected peers, deployed applications, blockchain sync status, and resource utilization. While useful for manual checks, the built-in dashboard does not provide external alerting — you need additional tools to receive notifications when problems arise.

External Monitoring Tools

1. FluxNodes.net
A community-maintained explorer that tracks all Flux nodes on the network. You can search for your node by IP address or collateral transaction to view uptime history, benchmark results, reward history, and current node status. Bookmark your node page for quick status checks.
2. NodeTracker (Zelcore)
The Zelcore wallet includes a built-in NodeTracker feature that shows the status of all nodes associated with your Zel ID. It provides at-a-glance status for operators running multiple nodes, including benchmark pass/fail, uptime, and last reward time.
3. UptimeRobot
A free external monitoring service that can poll your node's FluxOS API endpoint (port 16127) at regular intervals and send alerts via email, SMS, or webhook when the endpoint becomes unreachable. Configure an HTTP monitor pointing to http://YOUR_IP:16127/flux/info.
4. HetrixTools
Provides uptime monitoring with global check locations, blacklist monitoring (useful if your IP gets flagged), and advanced alerting options. The free tier supports monitoring multiple endpoints.

Setting Up Discord Webhook Alerts

Many Flux node operators use Discord for real-time alerts. You can create a Discord webhook in your private server channel and integrate it with monitoring services or custom scripts.

Simple Discord alert script for node status

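#!/bin/bash
# flux-monitor.sh: check the FluxOS API and alert a Discord webhook on failure

WEBHOOK_URL="https://discord.com/api/webhooks/YOUR_WEBHOOK_ID/YOUR_WEBHOOK_TOKEN"
NODE_IP="YOUR_NODE_IP"

# Read the top-level "status" field of the response (assumed to be "success"
# on a healthy node). STATUS stays empty if the request fails or times out.
STATUS=$(curl -s --max-time 10 "http://${NODE_IP}:16127/flux/info" | jq -r '.status' 2>/dev/null)

if [ "$STATUS" != "success" ]; then
  # Build the payload with jq so quotes inside the message are escaped correctly.
  PAYLOAD=$(jq -n --arg msg "**ALERT:** Flux node ${NODE_IP} may be down. FluxOS API returned: ${STATUS:-no response}" '{content: $msg}')
  curl -s -H "Content-Type: application/json" -d "$PAYLOAD" "$WEBHOOK_URL"
fi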

Schedule the monitoring script with a cron job running every 5 minutes: */5 * * * * /path/to/flux-monitor.sh. Run the script from an external server, not from the node itself — if the node is down, a local script cannot send alerts.
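
A check that runs every 5 minutes can fire on a single dropped packet and will repeat the same alert for as long as the node stays down. One way to reduce that noise is to count consecutive failures in a small state file and alert only when a threshold is crossed. The following is a minimal sketch of that idea; the function name, the state file location, and the threshold of 2 are all hypothetical choices.

```bash
#!/bin/bash
# record_result STATE_FILE RESULT THRESHOLD
# RESULT is "ok" or "fail". Prints "alert" once, when the consecutive
# failure count reaches THRESHOLD, and "recovered" when a check passes
# after an alert was raised. Prints nothing otherwise.
record_result() {
  local state_file="$1" result="$2" threshold="$3"
  local count
  count=$(cat "$state_file" 2>/dev/null || true)
  count=${count:-0}

  if [ "$result" = "ok" ]; then
    [ "$count" -ge "$threshold" ] && echo "recovered"
    echo 0 > "$state_file"
  else
    count=$((count + 1))
    echo "$count" > "$state_file"
    [ "$count" -eq "$threshold" ] && echo "alert"
  fi
  return 0
}

# Example wiring inside a monitoring script (not run here):
# event=$(record_result /tmp/flux-monitor.state fail 2)
# [ "$event" = "alert" ] && send_discord_message "Node down"   # hypothetical helper
```

With a threshold of 2 and a 5-minute schedule, an alert arrives after roughly 10 minutes of continuous failure, which is a trade-off between speed and false positives.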

Key Metrics to Track

Metric	Target	Why It Matters
Uptime Percentage	97%+ (ideally 99.5%+)	PNR eligibility requires consistent high uptime
Benchmark Pass Rate	100% over 7 days	Any benchmark failure triggers DOS state and reward loss
Reward Frequency	Consistent with tier average	Declining reward frequency indicates queue issues or downtime
Node Rank	Improving or stable	Rank reflects position in reward queue — lower is better (closer to next reward)
Last Reward Time	Within expected interval	Delayed rewards may indicate the node was dropped from the queue
FluxOS API Response	<2s response time	Slow API responses may indicate resource problems on the node
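
The API response time target in the table can be checked from the command line. The sketch below uses curl's --write-out variable time_total to measure a request and awk to compare the result against a threshold. The function names are hypothetical, and the 2-second limit comes from the table above.

```bash
#!/bin/bash
# is_slower_than SECONDS THRESHOLD: succeeds when SECONDS exceeds THRESHOLD.
# awk performs the decimal comparison, which the shell's [ ] cannot do.
is_slower_than() {
  awk -v t="$1" -v max="$2" 'BEGIN { exit !(t > max) }'
}

# api_response_time URL: prints the total request time in seconds.
# curl's exit status is non-zero when the request fails, so check it
# before trusting the printed value.
api_response_time() {
  curl -s -o /dev/null --max-time 10 -w '%{time_total}' "$1"
}

# Example (not run here):
# if t=$(api_response_time "http://YOUR_IP:16127/flux/info"); then
#   is_slower_than "$t" 2 && echo "FluxOS API is slow: ${t}s"
# else
#   echo "FluxOS API request failed"
# fi
```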

API-Based Monitoring

FluxOS exposes a comprehensive API that can be used for automated monitoring. Key endpoints include:

Useful FluxOS API monitoring endpoints

# Node information and status
GET http://YOUR_IP:16127/flux/info

# Current benchmark results
GET http://YOUR_IP:16127/flux/benchmarks

# Node uptime information
GET http://YOUR_IP:16127/flux/uptime

# Connected peers
GET http://YOUR_IP:16127/flux/connectedpeers

# Running applications
GET http://YOUR_IP:16127/apps/listrunningapps
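
As a rough illustration of polling these endpoints together, the sketch below loops over the paths listed above and prints the HTTP status code for each. The function names are hypothetical, and the script checks reachability only; it does not interpret the response bodies.

```bash
#!/bin/bash
# Endpoint paths taken from the list above.
ENDPOINTS="/flux/info /flux/benchmarks /flux/uptime /flux/connectedpeers /apps/listrunningapps"

# endpoint_url IP PATH: builds the full API URL on port 16127.
endpoint_url() {
  printf 'http://%s:16127%s\n' "$1" "$2"
}

# check_endpoints IP: prints one "HTTP_CODE URL" line per endpoint.
# curl reports code 000 when no HTTP response was received.
check_endpoints() {
  local ip="$1" path url code
  for path in $ENDPOINTS; do
    url=$(endpoint_url "$ip" "$path")
    code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url" || true)
    printf '%s %s\n' "${code:-000}" "$url"
  done
}

# Example (not run here):
# check_endpoints YOUR_IP
```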

Maintenance Planning

Planned maintenance is sometimes unavoidable — OS security updates, hardware upgrades, or provider migrations. To minimize impact on your uptime and PNR score:

• Schedule short maintenance windows: Aim for under 30 minutes. Brief downtime has minimal impact on PNR compared to extended outages.
• Avoid maintenance during benchmark windows: If your node is due for a benchmark, wait until after it passes before performing maintenance.
• Apply security updates promptly: Kernel and system updates can usually be applied with a quick reboot lasting 1-2 minutes.
• Use Flux multitool for FluxOS updates: The multitool script includes update options that minimize downtime during FluxOS version upgrades.

Server Migration Without Losing Uptime

When migrating to a new server, follow this process to minimize downtime:

1. Set up the new server completely
Install FluxOS on the new server and let the blockchain sync fully. Do NOT start the node yet; just have everything ready.
2. Stop the old node
Shut down FluxOS and the Flux daemon on the old server. This removes your node from the network.
3. Start the new node immediately
Configure the new server with your existing identity key and collateral details, then start the node from Zelcore. The total transition should take 5-15 minutes.
4. Verify the new node is confirmed
Monitor the FluxOS UI and node status to confirm the new node is CONFIRMED on the network.
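
While waiting on the final step, a small polling loop can tell you when the new server's FluxOS API starts answering. The sketch below is a generic retry helper with hypothetical function names. It only confirms that the API is reachable; the CONFIRMED status still has to be verified in the FluxOS UI as described above.

```bash
#!/bin/bash
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have been used.
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# api_is_up URL: succeeds when the URL answers with an HTTP success code.
api_is_up() {
  curl -sf -o /dev/null --max-time 10 "$1"
}

# Example (not run here): poll every 30 seconds for up to 15 minutes.
# retry_until 30 30 api_is_up "http://NEW_SERVER_IP:16127/flux/info" \
#   && echo "FluxOS API is responding on the new server"
```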

Emergency Procedures

If your node goes down unexpectedly, act quickly. SSH into the server and check service status with pm2 list and flux-cli getinfo. If FluxOS crashed, restart with pm2 restart flux. If the daemon crashed, restart with sudo systemctl restart zelcash. If the server itself is unresponsive, contact your hosting provider. Keep your node identity key and collateral details backed up securely so you can quickly spin up a replacement server if the original cannot be recovered.
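
The restart decisions above can be captured in a small helper so that you do not have to recall them under pressure. This is a sketch only: the function name is hypothetical, the health checks in the example are assumptions, and the daemon is listed first on the assumption that FluxOS depends on it. The helper prints the commands rather than running them, so you can review them before acting.

```bash
#!/bin/bash
# recovery_commands FLUXOS_OK DAEMON_OK
# Each argument is "yes" or "no". Prints the restart command for every
# service that is not healthy, using the commands given in this guide.
recovery_commands() {
  local fluxos_ok="$1" daemon_ok="$2"
  [ "$daemon_ok" = "yes" ] || echo "sudo systemctl restart zelcash"
  [ "$fluxos_ok" = "yes" ] || echo "pm2 restart flux"
  return 0
}

# Example wiring on the node itself (not run here; checks are assumptions):
# flux-cli getinfo >/dev/null 2>&1 && daemon=yes || daemon=no
# curl -sf --max-time 10 http://localhost:16127/flux/info >/dev/null && fluxos=yes || fluxos=no
# recovery_commands "$fluxos" "$daemon"
```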

Never run two nodes with the same identity key simultaneously. This will cause conflicting announcements on the network and may result in your node being penalized or removed from the node list.
