uptime-monitor
Continuous HTTP/HTTPS uptime monitoring with instant downtime alerts
Uptime Monitor Agent
Never miss a downtime event. This OpenClaw agent continuously monitors your websites, APIs, and services, alerting you the moment something goes down.
What It Does
The Uptime Monitor agent performs health checks on HTTP/HTTPS endpoints every 5 minutes, tracking:
- Availability — Is the endpoint responding?
- Response time — How fast is it?
- Status codes — 200 OK, 404, 500, timeouts
- SSL certificates — Expiration warnings
- State changes — Only alerts when status actually changes (up→down or down→up)
It's designed to be quiet when everything is fine and loud when something breaks.
Key Features
✅ Multi-endpoint monitoring — Track unlimited URLs
✅ Smart alerting — Only notify on state changes, not every check
✅ Response time tracking — Historical performance data
✅ SSL expiration warnings — Get notified 30 days and 7 days before a certificate expires
✅ Custom status page — Generate markdown status reports
✅ Configurable intervals — Default 5 min, customize per endpoint
✅ Timeout handling — Configurable timeouts (default 10s)
✅ Lightweight — Uses Claude Haiku for fast, cheap checks
Quick Start
1. Install
hatchery run @openclaw/uptime-monitor
2. Configure Endpoints
Edit memory/endpoints.json in your workspace:
{
  "endpoints": [
    {
      "name": "Production API",
      "url": "https://api.example.com/health",
      "method": "GET",
      "expectedStatus": 200,
      "timeout": 10,
      "checkInterval": 300
    },
    {
      "name": "Marketing Site",
      "url": "https://example.com",
      "method": "GET",
      "expectedStatus": 200,
      "timeout": 15,
      "checkInterval": 600
    }
  ]
}
3. Start Monitoring
The agent automatically begins checking endpoints on startup. You'll receive:
- Immediate alerts when an endpoint goes down
- Recovery notifications when it comes back up
- SSL warnings 30 days and 7 days before expiration
- Daily status summaries (optional)
Configuration
Environment Variables
None required! This agent works out of the box. Optional:
# Optional: Set your timezone for time-aware reporting
USER_TIMEZONE=America/New_York
# Optional: Slack/Discord webhook for alerts (in addition to agent DM)
ALERT_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
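As a rough illustration of how the optional webhook could be used, the sketch below builds (but does not send) a POST request for a Slack incoming webhook, which accepts a JSON body with a `text` field. The function name `build_alert_request` is hypothetical and not part of this template; a Discord webhook would need a `content` field instead of `text`.

```python
import json
import os
import urllib.request
from typing import Optional


def build_alert_request(message: str, webhook_url: Optional[str] = None) -> Optional[urllib.request.Request]:
    """Build a Slack-style webhook request, or return None if no webhook is configured."""
    url = webhook_url or os.environ.get("ALERT_WEBHOOK_URL")
    if not url:
        return None  # webhook alerts are optional
    payload = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_alert_request(
    "Production API is DOWN",
    "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
)
print(req.get_method(), req.data.decode("utf-8"))
```

Sending the request would be a call to `urllib.request.urlopen(req, timeout=10)`; it is left out here so the sketch runs without network access.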
Endpoint Configuration
Each endpoint in memory/endpoints.json supports:
- name (required) — Human-readable identifier
- url (required) — Full URL to check
- method (optional) — HTTP method, default GET
- expectedStatus (optional) — Expected HTTP status code, default 200
- timeout (optional) — Request timeout in seconds, default 10
- checkInterval (optional) — Seconds between checks, default 300 (5 min)
- headers (optional) — Custom headers object
- body (optional) — Request body for POST/PUT
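To make the defaults concrete, here is a minimal sketch of how an entry could be normalized before use. The default values come from the field list above; the helper name `normalize_endpoint` is an assumption for illustration and is not something the template is known to ship.

```python
# Defaults as documented in the field list; the helper itself is hypothetical.
DEFAULTS = {
    "method": "GET",
    "expectedStatus": 200,
    "timeout": 10,         # seconds
    "checkInterval": 300,  # seconds (5 min)
}


def normalize_endpoint(entry: dict) -> dict:
    """Return a copy of one endpoint entry with optional fields filled in."""
    for field in ("name", "url"):
        if not entry.get(field):
            raise ValueError(f"endpoint is missing required field: {field}")
    # Values given in the entry override the defaults.
    return {**DEFAULTS, **entry}


endpoint = normalize_endpoint(
    {"name": "Marketing Site", "url": "https://example.com", "timeout": 15}
)
print(endpoint["method"], endpoint["timeout"], endpoint["checkInterval"])  # GET 15 300
```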
Alert Preferences
Edit TOOLS.md to customize:
- Quiet hours — Don't alert between 11 PM and 7 AM unless critical
- Alert channels — Direct message, Slack, Discord, email
- Escalation rules — Who to notify after X minutes of downtime
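A quiet-hours window such as 11 PM to 7 AM wraps past midnight, which is easy to get wrong. The sketch below shows one way to test for it; the function names are hypothetical, and it assumes the caller has already converted the current time to the user's local timezone.

```python
from datetime import time

QUIET_START = time(23, 0)  # 11 PM
QUIET_END = time(7, 0)     # 7 AM


def in_quiet_hours(now: time, start: time = QUIET_START, end: time = QUIET_END) -> bool:
    """True if `now` falls in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    # Window wraps midnight: late evening or early morning both count.
    return now >= start or now < end


def should_alert(now: time, critical: bool) -> bool:
    """Critical alerts always go out; others are held during quiet hours."""
    return critical or not in_quiet_hours(now)


print(should_alert(time(2, 30), critical=False))  # False
print(should_alert(time(2, 30), critical=True))   # True
print(should_alert(time(12, 0), critical=False))  # True
```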
Usage Examples
Check Current Status
You: What's the current status of all endpoints?
Agent: 📊 Uptime Status Report
✅ Production API
https://api.example.com/health
Status: UP (200 OK)
Response time: 124ms
Uptime: 99.97% (last 30 days)
✅ Marketing Site
https://example.com
Status: UP (200 OK)
Response time: 456ms
Uptime: 100.00%
🟢 All systems operational
Downtime Alert
Agent: 🚨 DOWNTIME ALERT
Production API is DOWN
https://api.example.com/health
Error: Connection timeout after 10s
Last successful check: 5 minutes ago
Previous uptime: 45 days
I'll notify you when it recovers.
Recovery Notification
Agent: ✅ RECOVERED
Production API is back UP
https://api.example.com/health
Downtime duration: 8 minutes
Status: 200 OK
Response time: 132ms
SSL Expiration Warning
Agent: ⚠️ SSL Certificate Expiring Soon
https://example.com
Certificate expires in 28 days (March 15, 2024)
Please renew before expiration to avoid downtime.
Add New Endpoint
You: Add monitoring for https://status.example.com
Agent: Added new endpoint:
✅ status.example.com
URL: https://status.example.com
Check interval: 5 minutes
Timeout: 10s
Expected status: 200
First check in progress...
✅ UP (200 OK, 234ms)
I've updated memory/endpoints.json
How It Works
Architecture
- HEARTBEAT.md defines the check schedule (every 5 minutes)
- skills/uptime-check/check.sh performs the actual HTTP requests
- memory/endpoints.json stores endpoint configurations
- memory/uptime-state.json tracks current state and history
- Agent compares current state to previous state
- Alerts sent only on state transitions (up→down or down→up)
State Tracking
The agent maintains state in memory/uptime-state.json:
{
  "endpoints": {
    "https://api.example.com/health": {
      "status": "up",
      "lastCheck": 1708128000,
      "lastStatusChange": 1704412800,
      "consecutiveFailures": 0,
      "uptimePercentage": 99.97,
      "responseTimeHistory": [124, 132, 118, 145]
    }
  }
}
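For reading this file outside the agent, a small sketch like the following can derive a couple of figures from one record. It assumes the timestamps are Unix epoch seconds and the response times are milliseconds, which the example values suggest but the template does not state; `summarize` is a hypothetical helper.

```python
import json
from statistics import mean

# Same shape as the example above; timestamps are assumed to be Unix epoch seconds.
STATE_JSON = """
{
  "endpoints": {
    "https://api.example.com/health": {
      "status": "up",
      "lastCheck": 1708128000,
      "lastStatusChange": 1704412800,
      "consecutiveFailures": 0,
      "uptimePercentage": 99.97,
      "responseTimeHistory": [124, 132, 118, 145]
    }
  }
}
"""


def summarize(record: dict) -> dict:
    """Derive two convenience figures from one endpoint's state record."""
    history = record["responseTimeHistory"]
    return {
        "avgResponseMs": mean(history) if history else None,
        "daysInCurrentState": (record["lastCheck"] - record["lastStatusChange"]) / 86_400,
    }


state = json.loads(STATE_JSON)
summary = summarize(state["endpoints"]["https://api.example.com/health"])
print(summary)  # {'avgResponseMs': 129.75, 'daysInCurrentState': 43.0}
```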
Check Logic
# For each endpoint:
1. Load last known state from memory/uptime-state.json
2. Execute HTTP request with timeout
3. Record response time and status code
4. Compare to expected status
5. If state changed (up→down or down→up): ALERT
6. If state unchanged: Update metrics silently
7. Save new state to memory/uptime-state.json
8. Output HEARTBEAT_OK if nothing to report
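Steps 4 through 7 can be sketched as a single pure function, shown below. This is an illustration under assumptions, not the agent's actual implementation: `apply_check` is a hypothetical name, the field names follow the memory/uptime-state.json example, and the choice to treat the first observation of an endpoint as a baseline without an alert is mine.

```python
from typing import Optional, Tuple


def apply_check(previous: dict, ok: bool, response_ms: Optional[int], now: int) -> Tuple[dict, Optional[str]]:
    """Return (new_state, alert); alert is None when the status did not change."""
    new_status = "up" if ok else "down"
    old_status = previous.get("status")
    # Assumption: no alert on the very first observation of an endpoint.
    changed = old_status is not None and old_status != new_status

    state = dict(previous)
    state["status"] = new_status
    state["lastCheck"] = now
    state["consecutiveFailures"] = 0 if ok else previous.get("consecutiveFailures", 0) + 1
    if changed or old_status is None:
        state["lastStatusChange"] = now
    if response_ms is not None:
        state["responseTimeHistory"] = previous.get("responseTimeHistory", []) + [response_ms]

    alert = f"{old_status}->{new_status}" if changed else None
    return state, alert


up = {"status": "up", "lastStatusChange": 1704412800, "consecutiveFailures": 0,
      "responseTimeHistory": [124]}
down, alert = apply_check(up, ok=False, response_ms=None, now=1708128300)
print(alert)                      # up->down
still_down, alert = apply_check(down, ok=False, response_ms=None, now=1708128600)
print(alert)                      # None
```

Keeping the transition logic free of I/O means the same function can be exercised with recorded results, which makes alert behavior easy to test.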
Troubleshooting
Agent isn't checking endpoints
- Verify memory/endpoints.json exists and is valid JSON
- Check that URLs are accessible from the agent's network
- Look for errors in the agent's session logs
Too many alerts
- Increase checkInterval to reduce check frequency
- Adjust timeout if endpoints are legitimately slow
- Enable quiet hours in TOOLS.md
Missing recovery notifications
- The agent only notifies on state changes
- If you restarted the agent, it may have lost state
- Check memory/uptime-state.json for correct state tracking
SSL warnings not appearing
- SSL checks only happen once per day (not every heartbeat)
- Warnings appear at 30 days and 7 days before expiration
- Check that the endpoint uses HTTPS
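To sanity-check a warning by hand, the threshold logic can be approximated as below. The sketch assumes warnings fire whenever the remaining days are at or under a threshold (so a certificate with 28 days left falls under the 30-day warning); the agent's exact rule is not documented here, and both function names are hypothetical. The date string uses the `notAfter` format returned by Python's `ssl.SSLSocket.getpeercert()`.

```python
import ssl
from typing import Optional


def days_until_expiry(not_after: str, now: float) -> int:
    """Whole days between `now` (epoch seconds) and a certificate's notAfter date."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    return int((expires_at - now) // 86_400)


def ssl_warning(days_left: int) -> Optional[str]:
    """Map remaining days onto the 30-day and 7-day thresholds."""
    if days_left < 0:
        return "expired"
    if days_left <= 7:
        return "7-day warning"
    if days_left <= 30:
        return "30-day warning"
    return None


now = ssl.cert_time_to_seconds("Feb 16 00:00:00 2024 GMT")
days = days_until_expiry("Mar 15 00:00:00 2024 GMT", now)
print(days, ssl_warning(days))  # 28 30-day warning
```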
Advanced Usage
Custom Headers (Authentication)
{
  "name": "Authenticated API",
  "url": "https://api.example.com/private",
  "headers": {
    "Authorization": "Bearer YOUR_TOKEN",
    "X-Custom-Header": "value"
  }
}
POST Health Checks
{
  "name": "POST Endpoint",
  "url": "https://api.example.com/webhook",
  "method": "POST",
  "body": "{\"ping\":\"health\"}",
  "expectedStatus": 200
}
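To reproduce one of these checks by hand, the endpoint fields map onto an HTTP request roughly as follows. The template performs its checks through skills/uptime-check/check.sh, so this Python sketch is only an illustration of the mapping; `build_request` is a hypothetical helper and nothing is sent over the network.

```python
import urllib.request


def build_request(endpoint: dict) -> urllib.request.Request:
    """Translate one endpoint entry into a urllib request object."""
    body = endpoint.get("body")
    data = body.encode("utf-8") if body is not None else None
    return urllib.request.Request(
        endpoint["url"],
        data=data,
        headers=endpoint.get("headers", {}),
        method=endpoint.get("method", "GET"),
    )


req = build_request({
    "name": "POST Endpoint",
    "url": "https://api.example.com/webhook",
    "method": "POST",
    "headers": {"Authorization": "Bearer YOUR_TOKEN"},
    "body": "{\"ping\":\"health\"}",
    "expectedStatus": 200,
})
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the result to `urllib.request.urlopen(req, timeout=10)` would execute the check; a non-2xx response raises `urllib.error.HTTPError`, whose `code` attribute can be compared with `expectedStatus`.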
Status Page Generation
You: Generate a status page
Agent: [Creates memory/status-page.md with current status of all endpoints]
Status page generated at memory/status-page.md
You can publish this to your website or share with your team.
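The layout of the generated page is not specified here. As a purely illustrative sketch, a markdown report could be rendered from the state file along these lines; the heading text, line format, and the name `render_status_page` are all assumptions.

```python
def render_status_page(state: dict) -> str:
    """Render a minimal markdown status report from the uptime state structure."""
    lines = ["# Status", ""]
    for url, record in state["endpoints"].items():
        status = record["status"].upper()
        uptime = record["uptimePercentage"]
        lines.append(f"- {url}: {status} ({uptime:.2f}% uptime)")
    return "\n".join(lines) + "\n"


page = render_status_page({
    "endpoints": {
        "https://api.example.com/health": {"status": "up", "uptimePercentage": 99.97},
        "https://example.com": {"status": "down", "uptimePercentage": 98.5},
    }
})
print(page)
```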
Best Practices
✅ Start with critical endpoints — Don't monitor everything at once
✅ Set reasonable timeouts — Match your actual SLAs
✅ Use check intervals wisely — 5 min for critical, 15-30 min for non-critical
✅ Monitor health endpoints — Dedicated /health routes are better than homepage checks
✅ Test your endpoints — Make sure they're reachable from the agent's network
✅ Review weekly — Check uptime percentages and response time trends
Model & Cost
This agent uses Claude Haiku for:
- ⚡ Speed — Quick decisions on up/down state
- 💰 Cost efficiency — One check every 5 min = ~8,640 checks per 30-day month
- 🎯 Appropriate complexity — Simple boolean logic doesn't need Sonnet/Opus
Estimated cost: ~$2-5/month for 10 endpoints checked every 5 minutes.
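The check count behind this estimate is simple arithmetic and can be recomputed for other intervals. The sketch below only counts check runs and HTTP requests; it makes no assumption about token pricing, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def runs_per_month(check_interval: int, days: int = 30) -> int:
    """Number of scheduled check runs in a month for a given interval in seconds."""
    return (SECONDS_PER_DAY // check_interval) * days


runs = runs_per_month(300)   # every 5 minutes
print(runs)                  # 8640
print(runs * 10)             # 86400 HTTP requests if each run checks 10 endpoints
print(runs_per_month(900))   # 2880 runs at a 15-minute interval
```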
Contributing
Found a bug? Have a feature request? Open an issue or submit a PR!
License
MIT License - use freely in personal and commercial projects.
Made with OpenClaw — The self-hosted AI agent runtime.
Learn more at docs.openclaw.ai