Monitoring

Keep your ClawBook VPS running smoothly with built-in monitoring and optional external tools.

Built-in Monitoring

Health Check Command

Quick overview of system health:

clawbook-health

# Output:
# ClawBook Health Check
# =====================
# ✓ OpenClaw service: running
# ✓ PostgreSQL: running
# ✓ Caddy: running
# ✓ Disk: 45 GB / 80 GB (56%)
# ✓ Memory: 2.1 GB / 4 GB (52%)
# ✓ CPU: 15% average (2 cores)
# ✓ SSL: Valid (expires in 67 days)
#
# Status: HEALTHY
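
For scripted checks (for example from cron), the report can be parsed rather than read by eye. The sketch below assumes the output format shown above and assumes that a failing check starts with some marker other than ✓; the failure marker and the function names are illustrative, not part of ClawBook.

```python
import re

# Sample report copied from the output above (without the leading "# ").
SAMPLE = """\
ClawBook Health Check
=====================
✓ OpenClaw service: running
✓ PostgreSQL: running
✓ Caddy: running
✓ Disk: 45 GB / 80 GB (56%)
✓ Memory: 2.1 GB / 4 GB (52%)
✓ CPU: 15% average (2 cores)
✓ SSL: Valid (expires in 67 days)

Status: HEALTHY
"""


def overall_status(report: str) -> str:
    """Return the value of the 'Status:' line, or 'UNKNOWN' if it is missing."""
    match = re.search(r"^Status:\s*(\S+)", report, flags=re.MULTILINE)
    return match.group(1) if match else "UNKNOWN"


def failed_checks(report: str) -> list[str]:
    """Return check lines ('<marker> <name>: <detail>') that lack the ✓ marker."""
    failed = []
    for line in report.splitlines():
        line = line.strip()
        if ": " in line and not line.startswith(("✓", "Status:")):
            failed.append(line)
    return failed
```

To use it against a live system, pass in the text captured from `subprocess.run(["clawbook-health"], capture_output=True, text=True).stdout`.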

Dashboard Monitoring

View metrics in the web dashboard:

1. Log in to your dashboard
2. Go to Home / Status
3. View real-time gauges for:
   - CPU usage
   - Memory usage
   - Disk usage
   - Network I/O
   - Active connections

Service Status

Check individual services:

# All services
systemctl status openclaw postgresql caddy

# Single service
systemctl status openclaw

# Service logs
journalctl -u openclaw -f
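
The same check can be scripted. This is a minimal sketch that relies on `systemctl is-active`, which prints one state per unit in the order given; the service names come from the commands above, while the function names are illustrative.

```python
import subprocess

SERVICES = ["openclaw", "postgresql", "caddy"]


def parse_is_active(output: str, services: list[str]) -> dict[str, str]:
    """Pair each service with the state line systemctl printed for it."""
    return dict(zip(services, output.split()))


def down_services(states: dict[str, str]) -> list[str]:
    """Return the services whose state is anything other than 'active'."""
    return [name for name, state in states.items() if state != "active"]


def check_services(services: list[str] = SERVICES) -> dict[str, str]:
    # is-active can exit non-zero when units are down, so check=True
    # is deliberately omitted and only stdout is inspected.
    result = subprocess.run(
        ["systemctl", "is-active", *services],
        capture_output=True,
        text=True,
    )
    return parse_is_active(result.stdout, services)
```

Calling `down_services(check_services())` on the VPS returns an empty list when all three services are active.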

Resource Monitoring

CPU

# Current load
top -bn1 | head -5

# CPU info
nproc  # Number of cores
lscpu  # Detailed CPU info

# Sampled usage (requires the sysstat package)
sar -u 1 5  # 5 samples, 1 second apart
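
Load average is easier to interpret once it is divided by the core count reported by `nproc`. The sketch below shows that normalization; the function names are illustrative, and load per core is a different measure from the CPU percentage shown by the health check.

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Normalize a load average by core count; above 1.0 means tasks are queuing."""
    return load_average / cores


def current_load_per_core() -> float:
    """Read the 1-minute load average of this machine and normalize it."""
    one_minute, _, _ = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)
```

On a 2-core VPS, a 1-minute load average of 3.0 gives 1.5 per core, which suggests the CPU is oversubscribed.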

Memory

# Current usage
free -h

# Detailed memory info
head -10 /proc/meminfo

# Memory usage by process
ps aux --sort=-%mem | head -10
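
To turn `/proc/meminfo` into a single percentage, one option is to compare `MemAvailable` with `MemTotal`. This is a sketch with made-up sample values; how the ClawBook dashboard computes its memory gauge is not documented here, so the result may differ slightly.

```python
# Hypothetical excerpt of /proc/meminfo for a 4 GB machine.
SAMPLE_MEMINFO = """\
MemTotal:        4000000 kB
MemFree:          500000 kB
MemAvailable:    1000000 kB
"""


def parse_meminfo(text: str) -> dict[str, int]:
    """Map each field name to its numeric value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def memory_used_percent(fields: dict[str, int]) -> float:
    """Percentage of memory that is not available to new workloads."""
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100 * (total - available) / total
```

On the VPS, pass in the real file contents with `parse_meminfo(open("/proc/meminfo").read())`.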

Disk

# Disk usage
df -h

# Directory sizes
du -sh /var/log/*
du -sh /var/lib/postgresql/*

# Disk I/O (requires the sysstat package)
iostat -x 1 5
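
The disk percentage reported by the health check is simple arithmetic: 45 GB used of 80 GB is 56%. A small sketch of that calculation, using the 85% "Disk Full" threshold from the built-in alerts; the function and constant names are illustrative.

```python
import shutil

DISK_ALERT_PERCENT = 85  # "Disk Full" threshold from the built-in alerts


def percent_used(used: float, total: float) -> float:
    """Used space as a percentage of total space."""
    return 100 * used / total


def root_disk_percent(path: str = "/") -> float:
    """Percentage used on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return percent_used(usage.used, usage.total)


def disk_alert(percent: float) -> bool:
    """True when usage is above the alert threshold."""
    return percent > DISK_ALERT_PERCENT
```

The value can differ slightly from the `Use%` column of `df -h`, which is computed against the space available to non-root users.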

Network

# Current connections
ss -tulpn

# Bandwidth usage (requires vnstat)
vnstat -l  # Live monitoring

# Interface statistics (netstat requires net-tools; `ip -s link` is the modern equivalent)
netstat -i

Application Metrics

Message Statistics

View in dashboard: Home → Quick Stats

Or via CLI:

clawbook-stats messages

# Output:
# Messages (Last 24h)
# ==================
# WhatsApp: 145
# Telegram: 89
# Discord: 23
# Total: 257
#
# Messages (Last 7d): 1,523
# Messages (Last 30d): 5,891
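
If you want to feed these numbers into another tool, the report is easy to parse. The sketch below assumes the `label: number` layout shown above and checks that the per-channel counts add up to the reported total (145 + 89 + 23 = 257); the function names are illustrative.

```python
# Sample report copied from the output above (without the leading "# ").
SAMPLE = """\
Messages (Last 24h)
==================
WhatsApp: 145
Telegram: 89
Discord: 23
Total: 257

Messages (Last 7d): 1,523
Messages (Last 30d): 5,891
"""


def parse_counts(report: str) -> dict[str, int]:
    """Extract every 'label: number' line, dropping thousands separators."""
    counts = {}
    for line in report.splitlines():
        label, sep, value = line.rpartition(":")
        value = value.strip().replace(",", "")
        if sep and value.isdigit():
            counts[label.strip()] = int(value)
    return counts


def channel_total(counts: dict[str, int]) -> int:
    """Sum the per-channel rows, skipping the total and the period summaries."""
    return sum(
        value
        for label, value in counts.items()
        if label != "Total" and not label.startswith("Messages")
    )
```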

Token Usage

clawbook-stats tokens

# Output:
# Token Usage (Current Month)
# ===========================
# Provider: Anthropic
# Input tokens: 125,000
# Output tokens: 89,000
# Total cost: $2.14

Response Times

clawbook-stats performance

# Output:
# Response Times (Last 24h)
# =========================
# Average: 1.2s
# P50: 0.9s
# P95: 2.5s
# P99: 4.1s
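
P50, P95, and P99 are percentiles: P95 of 2.5s means that 95% of responses finished in 2.5 seconds or less. The sketch below uses the nearest-rank method on made-up sample data; the method `clawbook-stats` actually uses is not documented here, so treat this as an illustration of the concept only.

```python
import math

# Hypothetical response times in seconds.
SAMPLE_SECONDS = [0.5, 0.7, 0.9, 1.0, 1.2, 1.5, 2.0, 2.5, 3.0, 4.1]


def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def average(samples: list[float]) -> float:
    """Arithmetic mean, which a few slow responses can pull upward."""
    return sum(samples) / len(samples)
```

The average is sensitive to outliers, so the P95 and P99 values are usually the better indicators of what the slowest users experience.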

Log Monitoring

Log Locations

Log	Location	Purpose
Application	`/var/log/openclaw/app.log`	Main application logs
Access	`/var/log/openclaw/access.log`	HTTP requests
Error	`/var/log/openclaw/error.log`	Errors only
Backup	`/var/log/openclaw/backup.log`	Backup operations
System	`/var/log/syslog`	System events

Real-time Log Viewing

# Application logs
tail -f /var/log/openclaw/app.log

# All ClawBook logs
tail -f /var/log/openclaw/*.log

# Filtered errors
tail -f /var/log/openclaw/app.log | grep ERROR
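
For a summary rather than a live stream, log lines can be counted by severity. The log line format is not specified in this guide, so the sketch below only assumes that each line contains its level as an uppercase word; the sample lines in the usage note are invented.

```python
import re
from collections import Counter

# Assumption: the level appears somewhere in the line as a whole uppercase word.
LEVEL_PATTERN = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def count_levels(lines) -> Counter:
    """Count log lines by the first severity level found in each line."""
    counts = Counter()
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A file object can be passed directly, for example `count_levels(open("/var/log/openclaw/app.log"))`.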

Log Rotation

Logs are rotated automatically by logrotate. To view the rotation configuration:

cat /etc/logrotate.d/openclaw

Default settings:

- Rotate daily
- Keep 7 days
- Compress after 1 day

Alerting

Built-in Alerts

Configure in dashboard: Settings → Notifications

Alert	Threshold	Action
High CPU	> 90% for 5 min	Email
High Memory	> 90%	Email
Disk Full	> 85%	Email
Service Down	Any service	Email
SSL Expiring	< 14 days	Email
Backup Failed	Any failure	Email

Alert Configuration

# /etc/openclaw/config.yaml
alerts:
  email:
    enabled: true
    recipients:
      - admin@example.com

  thresholds:
    cpu_percent: 90
    memory_percent: 90
    disk_percent: 85
    ssl_days: 14

  cooldown_minutes: 30  # Minimum minutes before the same alert is sent again
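
To make the interaction between thresholds and the cooldown concrete, here is a minimal sketch of the logic. It is an illustration, not ClawBook's implementation: the class and function names are hypothetical, the "for 5 min" condition on CPU is not modeled, and `ssl_days` is left out because it is a lower bound rather than an upper one.

```python
# Values taken from the example config above.
THRESHOLDS = {"cpu_percent": 90, "memory_percent": 90, "disk_percent": 85}
COOLDOWN_SECONDS = 30 * 60


def breaches(metrics: dict[str, float]) -> list[str]:
    """Names of the metrics that are above their threshold."""
    return sorted(
        name for name, limit in THRESHOLDS.items() if metrics.get(name, 0) > limit
    )


class AlertGate:
    """Suppress repeats of the same alert until the cooldown has elapsed."""

    def __init__(self, cooldown_seconds: float = COOLDOWN_SECONDS):
        self.cooldown_seconds = cooldown_seconds
        self.last_sent: dict[str, float] = {}

    def should_send(self, name: str, now: float) -> bool:
        last = self.last_sent.get(name)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self.last_sent[name] = now
        return True
```

With a 30-minute cooldown, a CPU alert sent at minute 0 is suppressed at minute 10 and sent again at minute 30 if the condition persists.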

Slack Notifications

alerts:
  slack:
    enabled: true
    webhook_url: https://hooks.slack.com/services/xxx/yyy/zzz
    channel: "#alerts"
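
Before relying on the integration, it can help to confirm that the webhook URL works. Slack incoming webhooks accept a JSON body with a `text` field; the sketch below builds such a request with the standard library. The function names and message text are illustrative, and the URL is the placeholder from the config above.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a plain-text Slack message."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
    )


def send_slack_message(webhook_url: str, text: str) -> int:
    """Send the message and return the HTTP status code."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `send_slack_message(url, "ClawBook test alert")` with your real webhook URL should post a message to the channel the webhook is bound to.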

External Monitoring

UptimeRobot (Free)

1. Sign up at uptimerobot.com
2. Add new monitor:
   - Type: HTTPS
   - URL: https://yourdomain.com
   - Interval: 5 minutes
3. Configure alert contacts

Datadog

Professional monitoring with detailed metrics:

# Install the agent (replace your_api_key; set DD_SITE to match your Datadog region)
DD_API_KEY=your_api_key DD_SITE="datadoghq.com" bash -c "$(curl -L https://install.datadoghq.com/scripts/install_script_agent7.sh)"

# Configure
nano /etc/datadog-agent/datadog.yaml

Prometheus + Grafana

For self-hosted monitoring:

# /etc/openclaw/config.yaml
metrics:
  prometheus:
    enabled: true
    port: 9090
    path: /metrics

Access metrics at http://localhost:9090/metrics
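
The endpoint returns metrics in the Prometheus text format, one `name value` sample per line with `#` comment lines in between. A minimal parser sketch follows; the metric names in the usage note are hypothetical, since this guide does not list the metrics OpenClaw exports, and optional timestamps are not handled.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse 'name{labels} value' lines, skipping comments and blank lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last run of whitespace so label values may contain spaces.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue
        name, value = parts
        samples[name] = float(value)
    return samples
```

For example, a line such as `openclaw_messages_total{channel="telegram"} 89` would be returned as the key `openclaw_messages_total{channel="telegram"}` with the value `89.0`.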

Status Page

Public Status Page

Create a status page for users:

- Statuspage.io (paid)
- Cachet (self-hosted, free)
- Upptime (GitHub-based, free)

Integration

Automatically update status page:

# /etc/openclaw/config.yaml
status_page:
  provider: statuspage.io
  page_id: your_page_id
  api_key: your_api_key
  component_id: your_component_id
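
For context, an integration like this typically works by updating a component's status through the provider's API. The sketch below builds such a request for Statuspage, based on my understanding of its REST API (a `PATCH` to the component with an `OAuth` authorization header); verify the endpoint and status values against the official API reference before using it. The function name is illustrative, and this is not a description of how ClawBook itself performs the update.

```python
import json
import urllib.request

API_BASE = "https://api.statuspage.io/v1"


def build_component_update(
    page_id: str, component_id: str, api_key: str, status: str
) -> urllib.request.Request:
    """Build a PATCH request that sets one component's status."""
    body = json.dumps({"component": {"status": status}}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/pages/{page_id}/components/{component_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"OAuth {api_key}",
            "Content-Type": "application/json",
        },
    )
```

The request can be sent with `urllib.request.urlopen(request, timeout=10)`, using a status such as `operational` or `major_outage`.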

Monitoring Best Practices

1. Check daily - Review dashboard or health command
2. Set up alerts - Don't wait for problems
3. Monitor externally - Catch issues you can't see internally
4. Review logs weekly - Look for patterns
5. Track trends - Storage growth, message volume
6. Plan capacity - Upgrade before hitting limits
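
Trend tracking and capacity planning come down to simple arithmetic: given current usage and a growth rate, estimate how long until a limit is reached. The sketch below assumes linear growth; the 0.5 GB/day figure in the example is invented, while the 45 GB / 80 GB usage and the 85% threshold come from earlier in this guide.

```python
def days_until_threshold(
    used_gb: float,
    total_gb: float,
    growth_gb_per_day: float,
    threshold_percent: float,
) -> float:
    """Days until usage reaches the threshold, assuming constant daily growth."""
    if growth_gb_per_day <= 0:
        return float("inf")  # Usage is flat or shrinking.
    limit_gb = total_gb * threshold_percent / 100
    return max(limit_gb - used_gb, 0) / growth_gb_per_day
```

With 45 GB used of 80 GB and growth of 0.5 GB per day, the 85% alert threshold (68 GB) is 46 days away: `days_until_threshold(45, 80, 0.5, 85)` returns `46.0`.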

Troubleshooting High Resource Usage

High CPU

# Find culprit
top -c

# Common causes:
# - High message volume
# - Inefficient AI responses
# - Background processes

# Solution:
# - Upgrade plan
# - Rate limit users
# - Optimize settings

High Memory

# Find memory hogs
ps aux --sort=-%mem | head -10

# Common causes:
# - Large conversation contexts
# - Memory leaks
# - PostgreSQL buffers

# Solution:
# - Reduce context window
# - Restart services periodically
# - Upgrade plan

Disk Full

# Find large directories
du -sh /* 2>/dev/null | sort -rh | head -10  # Hide errors from /proc and unreadable paths

# Clean up
apt autoremove -y
journalctl --vacuum-time=7d
clawbook-cleanup logs --older-than 30d

Next Steps

- Troubleshooting - Fix issues
- Performance - Optimize speed
- Backup & Restore - Protect your data
