Motivation
I need to maintain more than 20 physical servers at work. I connect to them via SSH and have to access each one individually to check CPU usage, disk usage, GPU usage, etc.
This process is time-consuming and doesn't scale well. I needed a better solution.
Introducing ssh-monitor
ssh-monitor is a terminal-based tool I built to solve this exact problem. It automatically discovers SSH hosts from your config and provides real-time system metrics in an interactive terminal interface.
Installation
- Start SSH agent and add your keys:
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
- Install ssh-monitor:
curl -sSL https://raw.githubusercontent.com/tsugumi-sys/ssh-monitor/main/install.sh | bash
- Run it:
ssh-monitor
Current Features
- Host Discovery: Automatically finds SSH hosts from your SSH config
- Interactive Terminal UI: Built with Ratatui for smooth navigation
- Metrics Tracked:
  - CPU usage and historical timeline
  - Memory utilization
  - Disk usage
  - GPU metrics (when available)
How It Works
Simply run ssh-monitor and it will:
- Discover hosts from your SSH config file automatically
- Display host list - navigate with arrow keys
- View detailed metrics - press Enter to see real-time charts for a specific host
- Monitor continuously - metrics update automatically
The interface shows CPU usage timelines, memory utilization, disk space, and GPU metrics when available.
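To make the discovery step concrete, here is a minimal sketch of how host aliases could be pulled out of an SSH config file. It is not ssh-monitor's actual implementation: the function name list_ssh_hosts is hypothetical, and the sketch assumes plain "Host alias ..." lines (no Include directives, no "Host=alias" syntax). Wildcard and negated patterns are skipped because they are not hosts you can connect to directly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ssh-monitor): print the concrete host
# aliases declared in an OpenSSH client config, one per line.
# Assumption: entries use the common "Host alias [alias ...]" form.
list_ssh_hosts() {
  local config="${1:-$HOME/.ssh/config}"
  awk '
    # Keywords are case-insensitive in ssh_config, so normalize the first field.
    tolower($1) == "host" {
      for (i = 2; i <= NF; i++)
        # Skip patterns such as "*", "web-?" or "!bastion".
        if ($i !~ /[*?!]/) print $i
    }
  ' "$config"
}

# Usage sketch:
#   list_ssh_hosts                 # reads ~/.ssh/config
#   list_ssh_hosts /path/to/config
```

A real tool would also need to resolve HostName, User, and Port for each alias; running ssh -G with an alias prints the effective settings and is one way to do that.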
Technical Implementation: Efficient SSH Connections
One key challenge was minimizing SSH connections. A naive approach would create a new SSH connection for each metric on each server.
With 10 servers polled every 5 seconds, that's 120 connections per minute for every metric collected, so the total grows with each metric you add. That is a real cost for both the client and the servers.
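The arithmetic is easy to check: connections per minute = hosts x connections opened per host per poll x polls per minute. The small function below is only an illustration; its name and the four-connection example are assumptions, not figures from ssh-monitor.

```shell
#!/usr/bin/env bash
# Hypothetical helper: connections opened per minute for a polling setup.
#   $1 = number of hosts
#   $2 = polling interval in seconds
#   $3 = connections opened per host on each poll
connections_per_minute() {
  local hosts=$1 interval=$2 per_poll=$3
  echo $(( hosts * per_poll * 60 / interval ))
}

connections_per_minute 10 5 1   # prints 120 (one connection per host per poll)
connections_per_minute 10 5 4   # prints 480 (e.g. four metrics, one connection each)
```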
Instead, ssh-monitor batches multiple metrics into a single SSH session:
echo __BEGIN_cpu__
bash -c '
  if [[ "$(uname)" == "Darwin" ]]; then
    sysctl -a;
  else
    lscpu;
  fi
'
echo __END_cpu__
echo __BEGIN_mem__
bash -c 'uname -s && echo __MEM__ && (free -m || (echo __MAC__ && sysctl -n hw.memsize && vm_stat))'
echo __END_mem__
echo __BEGIN_disk__
df -Pm | tail -n +2
echo __END_disk__
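The output is wrapped in delimiter tags, so the tool can split it back into individual metrics after a single command execution. Each host then needs one connection per polling cycle instead of one per metric; with four metrics per host, for example, that cuts the connection count to a quarter.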
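The parsing side can be sketched in a few lines of shell. This is an illustration of the delimiter idea, not ssh-monitor's own parser: extract_section is a hypothetical name, and the usage comment assumes the batched commands are saved in a file called metrics.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read batched output on stdin and print only the lines
# between __BEGIN_<name>__ and __END_<name>__ (markers excluded).
extract_section() {
  local name="$1"
  awk -v start="__BEGIN_${name}__" -v stop="__END_${name}__" '
    $0 == stop  { inside = 0 }   # closing marker: stop printing
    inside      { print }        # a line inside the requested section
    $0 == start { inside = 1 }   # opening marker: print from the next line on
  '
}

# Usage sketch (metrics.sh and myhost are assumptions):
#   output=$(ssh myhost 'bash -s' < metrics.sh)      # one connection
#   printf '%s\n' "$output" | extract_section cpu    # then split locally
#   printf '%s\n' "$output" | extract_section disk
```

Matching whole lines against the markers keeps the parser from being confused by command output that merely contains similar text in the middle of a line.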