Uptime Monitor Agent

Never miss a downtime event. This OpenClaw agent continuously monitors your websites, APIs, and services, alerting you the moment something goes down.

What It Does

The Uptime Monitor agent performs health checks on HTTP/HTTPS endpoints every 5 minutes, tracking:

Availability — Is the endpoint responding?
Response time — How fast is it?
Status codes — 200 OK, 404, 500, timeouts
SSL certificates — Expiration warnings
State changes — Only alerts when status actually changes (up→down or down→up)

It's designed to be quiet when everything is fine and loud when something breaks.

Key Features

✅ Multi-endpoint monitoring — Track unlimited URLs
✅ Smart alerting — Only notify on state changes, not every check
✅ Response time tracking — Historical performance data
✅ SSL expiration warnings — Get notified 30 days and 7 days before a certificate expires
✅ Custom status page — Generate markdown status reports
✅ Configurable intervals — Default 5 min, customize per endpoint
✅ Timeout handling — Configurable timeouts (default 10s)
✅ Lightweight — Uses Claude Haiku for fast, cheap checks

Quick Start

1. Install

hatchery run @openclaw/uptime-monitor

2. Configure Endpoints

Edit memory/endpoints.json in your workspace:

{
  "endpoints": [
    {
      "name": "Production API",
      "url": "https://api.example.com/health",
      "method": "GET",
      "expectedStatus": 200,
      "timeout": 10,
      "checkInterval": 300
    },
    {
      "name": "Marketing Site",
      "url": "https://example.com",
      "method": "GET",
      "expectedStatus": 200,
      "timeout": 15,
      "checkInterval": 600
    }
  ]
}

3. Start Monitoring

The agent automatically begins checking endpoints on startup. You'll receive:

Immediate alerts when an endpoint goes down
Recovery notifications when it comes back up
SSL warnings 30 days and 7 days before expiration
Daily status summaries (optional)

Configuration

Environment Variables

None are required; the agent works out of the box. The following variables are optional:

# Optional: Set your timezone for time-aware reporting
USER_TIMEZONE=America/New_York

# Optional: Slack/Discord webhook for alerts (in addition to agent DM)
ALERT_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
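
As a rough illustration of how an alert could be forwarded to the optional webhook, the Python sketch below builds a POST request from ALERT_WEBHOOK_URL. The helper name build_webhook_request is hypothetical, and the {"text": ...} payload is the shape Slack incoming webhooks accept (Discord webhooks expect a "content" field instead). How the agent itself delivers webhook alerts is not documented here.

```python
import json
import os
import urllib.request
from typing import Optional


def build_webhook_request(message: str,
                          webhook_url: Optional[str] = None) -> Optional[urllib.request.Request]:
    """Build (but do not send) a Slack-style webhook POST for an alert.

    Hypothetical helper: falls back to the ALERT_WEBHOOK_URL environment
    variable and returns None when no webhook is configured.
    """
    url = webhook_url or os.environ.get("ALERT_WEBHOOK_URL")
    if not url:
        return None  # the webhook is optional, so there is nothing to send
    payload = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_webhook_request(
    "Production API is DOWN",
    webhook_url="https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
)
```

Passing the result to urllib.request.urlopen(request, timeout=10) would actually deliver the message.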

Endpoint Configuration

Each endpoint in memory/endpoints.json supports:

name (required) — Human-readable identifier
url (required) — Full URL to check
method (optional) — HTTP method, default GET
expectedStatus (optional) — Expected HTTP status code, default 200
timeout (optional) — Request timeout in seconds, default 10
checkInterval (optional) — Seconds between checks, default 300 (5 min)
headers (optional) — Custom headers object
body (optional) — Request body for POST/PUT
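
To make the defaults above concrete, here is a minimal Python sketch that fills in omitted optional fields. The function names apply_defaults and load_endpoints, and the choice to raise ValueError on a missing required field, are assumptions for illustration; the agent's own loader may behave differently.

```python
import json

# Defaults taken from the field list above.
DEFAULTS = {
    "method": "GET",
    "expectedStatus": 200,
    "timeout": 10,         # seconds
    "checkInterval": 300,  # seconds (5 min)
}
REQUIRED = ("name", "url")


def apply_defaults(endpoint: dict) -> dict:
    """Return a copy of `endpoint` with optional fields filled in.

    Raising ValueError for a missing required field is an assumed
    behavior, not something this README specifies.
    """
    missing = [key for key in REQUIRED if not endpoint.get(key)]
    if missing:
        raise ValueError(f"endpoint missing required field(s): {', '.join(missing)}")
    return {**DEFAULTS, **endpoint}


def load_endpoints(text: str) -> list[dict]:
    """Parse the contents of memory/endpoints.json and normalize each entry."""
    config = json.loads(text)
    return [apply_defaults(entry) for entry in config.get("endpoints", [])]


example = '{"endpoints": [{"name": "Marketing Site", "url": "https://example.com", "timeout": 15}]}'
normalized = load_endpoints(example)
print(normalized[0]["checkInterval"])  # 300 (default applied)
print(normalized[0]["timeout"])        # 15 (explicit value kept)
```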

Alert Preferences

Edit TOOLS.md to customize:

Quiet hours — Don't alert between 11 PM and 7 AM unless critical
Alert channels — Direct message, Slack, Discord, email
Escalation rules — Who to notify after X minutes of downtime
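
The quiet-hours rule is easy to get wrong because the window crosses midnight. The Python sketch below shows one way to evaluate it; the function names are hypothetical, and it assumes the time passed in is already in your local timezone (for example, derived from USER_TIMEZONE).

```python
from datetime import time


def in_quiet_hours(now: time,
                   start: time = time(23, 0),
                   end: time = time(7, 0)) -> bool:
    """True if `now` falls in [start, end), including windows that cross midnight."""
    if start <= end:
        return start <= now < end
    # Window wraps past midnight, e.g. 23:00 to 07:00.
    return now >= start or now < end


def should_alert(now: time, critical: bool) -> bool:
    """Critical alerts always go out; others are held during quiet hours."""
    return critical or not in_quiet_hours(now)


print(should_alert(time(2, 30), critical=False))  # False
print(should_alert(time(2, 30), critical=True))   # True
```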

Usage Examples

Check Current Status

You: What's the current status of all endpoints?

Agent: 📊 Uptime Status Report

✅ Production API
   https://api.example.com/health
   Status: UP (200 OK)
   Response time: 124ms
   Uptime: 99.97% (last 30 days)

✅ Marketing Site
   https://example.com
   Status: UP (200 OK)
   Response time: 456ms
   Uptime: 100.00% (last 30 days)

🟢 All systems operational

Downtime Alert

Agent: 🚨 DOWNTIME ALERT

Production API is DOWN
https://api.example.com/health

Error: Connection timeout after 10s
Last successful check: 5 minutes ago
Previous uptime: 45 days

I'll notify you when it recovers.

Recovery Notification

Agent: ✅ RECOVERED

Production API is back UP
https://api.example.com/health

Downtime duration: 8 minutes
Status: 200 OK
Response time: 132ms

SSL Expiration Warning

Agent: ⚠️ SSL Certificate Expiring Soon

https://example.com
Certificate expires in 30 days (March 15, 2024)

Please renew before expiration to avoid downtime.

Add New Endpoint

You: Add monitoring for https://status.example.com

Agent: Added new endpoint:

✅ status.example.com
   URL: https://status.example.com
   Check interval: 5 minutes
   Timeout: 10s
   Expected status: 200

First check in progress...
✅ UP (200 OK, 234ms)

I've updated memory/endpoints.json

How It Works

Architecture

HEARTBEAT.md defines the check schedule (every 5 minutes)
skills/uptime-check/check.sh performs the actual HTTP requests
memory/endpoints.json stores endpoint configurations
memory/uptime-state.json tracks current state and history
Agent compares current state to previous state
Alerts sent only on state transitions (up→down or down→up)

State Tracking

The agent maintains state in memory/uptime-state.json:

{
  "endpoints": {
    "https://api.example.com/health": {
      "status": "up",
      "lastCheck": 1708128000,
      "lastStatusChange": 1704412800,
      "consecutiveFailures": 0,
      "uptimePercentage": 99.97,
      "responseTimeHistory": [124, 132, 118, 145]
    }
  }
}
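
For a sense of what can be derived from this file, the Python sketch below summarizes one record. It assumes lastCheck and lastStatusChange are Unix timestamps in seconds and that responseTimeHistory holds milliseconds; both readings are inferred from the sample values rather than stated by the agent.

```python
import json
from statistics import mean

# The sample record from above, inlined so the sketch is self-contained.
state = json.loads("""
{
  "endpoints": {
    "https://api.example.com/health": {
      "status": "up",
      "lastCheck": 1708128000,
      "lastStatusChange": 1704412800,
      "consecutiveFailures": 0,
      "uptimePercentage": 99.97,
      "responseTimeHistory": [124, 132, 118, 145]
    }
  }
}
""")


def summarize(record: dict) -> dict:
    """Derive a few headline numbers from one endpoint's state record."""
    seconds_in_state = record["lastCheck"] - record["lastStatusChange"]
    return {
        "status": record["status"],
        "avg_response_ms": mean(record["responseTimeHistory"]),
        "days_in_current_state": seconds_in_state / 86400,
    }


summary = summarize(state["endpoints"]["https://api.example.com/health"])
print(summary["avg_response_ms"])        # 129.75
print(summary["days_in_current_state"])  # 43.0
```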

Check Logic

# For each endpoint:
1. Load last known state from memory/uptime-state.json
2. Execute HTTP request with timeout
3. Record response time and status code
4. Compare to expected status
5. If state changed (up→down or down→up): ALERT
6. If state unchanged: Update metrics silently
7. Save new state to memory/uptime-state.json
8. Output HEARTBEAT_OK if nothing to report
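
The steps above can be sketched in Python as follows. This is an illustration only, not the contents of skills/uptime-check/check.sh: the function names, the 100-sample history cap, and the decision to treat an endpoint's very first check as a silent baseline are all assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Optional


def probe(url: str, method: str = "GET", timeout: float = 10) -> tuple[Optional[int], float]:
    """Steps 2-3: make the request; return (status code or None, elapsed ms)."""
    request = urllib.request.Request(url, method=method)
    started = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered with a 4xx/5xx
    except OSError:
        status = None      # timeout, DNS failure, refused connection
    return status, (time.monotonic() - started) * 1000


def evaluate(previous: dict, status: Optional[int], expected: int,
             elapsed_ms: float, now: int) -> tuple[dict, Optional[str]]:
    """Steps 4-7: compare to the expected status and detect transitions."""
    is_up = status == expected
    new_status = "up" if is_up else "down"
    # Assumption: with no previous status, record a baseline without alerting.
    changed = previous.get("status") not in (None, new_status)
    history = previous.get("responseTimeHistory", []) + [round(elapsed_ms)]
    new_state = {
        **previous,
        "status": new_status,
        "lastCheck": now,
        "lastStatusChange": now if changed else previous.get("lastStatusChange", now),
        "consecutiveFailures": 0 if is_up else previous.get("consecutiveFailures", 0) + 1,
        "responseTimeHistory": history[-100:],  # assumed cap on stored samples
    }
    alert = None
    if changed:
        alert = "RECOVERED" if is_up else "DOWNTIME ALERT"
    return new_state, alert


# Simulated sequence: up -> timeout -> still failing -> recovered.
state = {"status": "up", "lastCheck": 100, "lastStatusChange": 50,
         "consecutiveFailures": 0, "responseTimeHistory": [124]}
state, alert = evaluate(state, None, 200, 10000.0, now=400)
print(alert or "HEARTBEAT_OK")  # DOWNTIME ALERT
state, alert = evaluate(state, 500, 200, 80.0, now=700)
print(alert or "HEARTBEAT_OK")  # HEARTBEAT_OK (step 8: nothing new to report)
```

Keeping the transition logic separate from the network call makes it possible to exercise the alerting rules without hitting a real endpoint.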

Troubleshooting

Agent isn't checking endpoints

Verify memory/endpoints.json exists and is valid JSON
Check that URLs are accessible from the agent's network
Look for errors in the agent's session logs

Too many alerts

Increase checkInterval to reduce check frequency
Adjust timeout if endpoints are legitimately slow
Enable quiet hours in TOOLS.md

Missing recovery notifications

The agent only notifies on state changes
If you restarted the agent, it may have lost state
Check memory/uptime-state.json for correct state tracking

SSL warnings not appearing

SSL checks only happen once per day (not every heartbeat)
Warnings appear at 30 days and 7 days before expiration
Check that the endpoint uses HTTPS
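
If you want to verify a certificate's expiry yourself while debugging, the Python sketch below shows one way to compute the days remaining. The function names are hypothetical and this is not necessarily how the agent performs its SSL check. fetch_not_after needs network access and is not called in the example.

```python
import socket
import ssl
from datetime import datetime, timezone
from typing import Optional

WARNING_DAYS = (30, 7)  # thresholds described above


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` and a certificate's notAfter string."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def warning_threshold(days_left: int) -> Optional[int]:
    """Return 30 or 7 when today is exactly a warning day, else None."""
    return days_left if days_left in WARNING_DAYS else None


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 10) -> str:
    """Open a TLS connection and read the peer certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


today = datetime(2024, 2, 14, 12, 0, tzinfo=timezone.utc)
remaining = days_until_expiry("Mar 15 12:00:00 2024 GMT", today)
print(remaining, warning_threshold(remaining))  # 30 30
```

Because the thresholds match exact day counts and the check runs once per day, a day on which the check does not run could skip a warning; this may explain a missing notification.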

Advanced Usage

Custom Headers (Authentication)

{
  "name": "Authenticated API",
  "url": "https://api.example.com/private",
  "headers": {
    "Authorization": "Bearer YOUR_TOKEN",
    "X-Custom-Header": "value"
  }
}

POST Health Checks

{
  "name": "POST Endpoint",
  "url": "https://api.example.com/webhook",
  "method": "POST",
  "body": "{\"ping\":\"health\"}",
  "expectedStatus": 200
}
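
To show how the headers, method, and body fields might map onto an actual HTTP request, here is a small Python sketch using the standard library. The helper build_request is hypothetical; the agent's real check lives in skills/uptime-check/check.sh and may construct requests differently.

```python
import urllib.request


def build_request(endpoint: dict) -> urllib.request.Request:
    """Translate one endpoint config entry into a urllib Request (not sent)."""
    body = endpoint.get("body")
    data = body.encode("utf-8") if body is not None else None
    return urllib.request.Request(
        endpoint["url"],
        data=data,
        headers=endpoint.get("headers", {}),
        method=endpoint.get("method", "GET"),
    )


request = build_request({
    "name": "POST Endpoint",
    "url": "https://api.example.com/webhook",
    "method": "POST",
    "headers": {"Authorization": "Bearer YOUR_TOKEN"},
    "body": "{\"ping\":\"health\"}",
    "expectedStatus": 200,
})
print(request.get_method())  # POST
print(request.data)          # b'{"ping":"health"}'
```

With urllib, a body sent without an explicit Content-Type header is labeled application/x-www-form-urlencoded, so a JSON health check would likely want "Content-Type": "application/json" added under headers.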

Status Page Generation

You: Generate a status page

Agent: [Creates memory/status-page.md with current status of all endpoints]

Status page generated at memory/status-page.md
You can publish this to your website or share with your team.
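
For a rough idea of what such a report could contain, the Python sketch below renders markdown from the state structure shown under State Tracking. The layout, the render_status_page name, and the separate URL-to-name mapping are assumptions; the agent's actual output format may differ.

```python
def render_status_page(state: dict, names: dict[str, str]) -> str:
    """Render a markdown status report from uptime state.

    `names` maps each URL to its human-readable name from endpoints.json.
    """
    lines = ["# Status", ""]
    all_up = True
    for url, record in state["endpoints"].items():
        up = record["status"] == "up"
        all_up = all_up and up
        history = record.get("responseTimeHistory") or []
        latest = f"{history[-1]}ms" if history else "n/a"
        lines += [
            f"## {names.get(url, url)}",
            f"- URL: {url}",
            f"- Status: {'UP' if up else 'DOWN'}",
            f"- Response time: {latest}",
            f"- Uptime: {record['uptimePercentage']:.2f}%",
            "",
        ]
    lines.append("All systems operational" if all_up else "Some systems are down")
    return "\n".join(lines)


page = render_status_page(
    {"endpoints": {"https://api.example.com/health": {
        "status": "up",
        "uptimePercentage": 99.97,
        "responseTimeHistory": [124, 132, 118, 145],
    }}},
    names={"https://api.example.com/health": "Production API"},
)
print(page)
```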

Best Practices

✅ Start with critical endpoints — Don't monitor everything at once
✅ Set reasonable timeouts — Match your actual SLAs
✅ Use check intervals wisely — 5 min for critical, 15-30 min for non-critical
✅ Monitor health endpoints — Dedicated /health routes are better than homepage checks
✅ Test your endpoints — Make sure they're reachable from the agent's network
✅ Review weekly — Check uptime percentages and response time trends

Model & Cost

This agent uses Claude Haiku for:

⚡ Speed — Quick decisions on up/down state
💰 Cost efficiency — A check every 5 min is 8,640 checks per endpoint per 30-day month
🎯 Appropriate complexity — Simple boolean logic doesn't need Sonnet/Opus

Estimated cost: ~$2-5/month for 10 endpoints checked every 5 minutes.
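
The arithmetic behind these figures can be checked with a short Python sketch. It assumes a 30-day month and treats the $2-5 estimate as given; the per-check cost it derives is implied by those numbers, not a published price.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def checks_per_month(interval_seconds: int, endpoints: int = 1, days: int = 30) -> int:
    """Number of checks run in a month at a fixed interval."""
    return (days * SECONDS_PER_DAY // interval_seconds) * endpoints


def implied_cost_per_check(monthly_cost: float, interval_seconds: int, endpoints: int) -> float:
    """Back out the per-check cost implied by a monthly estimate."""
    return monthly_cost / checks_per_month(interval_seconds, endpoints)


print(checks_per_month(300))                # 8640
print(checks_per_month(300, endpoints=10))  # 86400
low = implied_cost_per_check(2.0, 300, 10)
high = implied_cost_per_check(5.0, 300, 10)
print(f"${low:.6f} to ${high:.6f} per check")
```

At 10 endpoints the estimate works out to roughly $0.00002 to $0.00006 per check, and lengthening the interval for non-critical endpoints to 15 minutes cuts their check count to a third.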

Contributing

Found a bug? Have a feature request? Open an issue or submit a PR!

License

MIT License - use freely in personal and commercial projects.

Made with OpenClaw — The self-hosted AI agent runtime.
Learn more at docs.openclaw.ai
