Initial commit

2026-03-23 18:28:29 -04:00
commit 3cd36feb0b
12 changed files with 2244 additions and 0 deletions

4
.gitignore vendored Normal file

@@ -0,0 +1,4 @@
# ZPulse - Developed by acidvegas in Python (https://github.com/acidvegas/rackwatch)
# zpulse/.gitignore
venv/

BIN
.screens/preview.png Normal file

Binary file not shown (94 KiB).

BIN
.screens/preview2.png Normal file

Binary file not shown (740 KiB).

91
README.md Normal file

@@ -0,0 +1,91 @@
# ZPulse
Real-time ZFS & disk monitoring for home server racks. Built to watch over multiple nodes with dozens of drives, streaming SMART health, ZFS pool status, I/O rates, and temperatures to a single dashboard over WebSocket. Alerts go to Gotify.
## Why
Every time I looked for a way to keep tabs on disk health across all the nodes in my home rack, the answer was always the same stack: Grafana, Telegraf, Prometheus, maybe throw InfluxDB in there too. That is an absurd amount of infrastructure just to answer "are my drives dying?" I didn't need time-series databases and query languages and dashboarding frameworks. I needed something that tells me if a disk is getting hot, if a ZFS pool is degraded, or if SMART errors are creeping up, across every machine, in one place.
Nothing out there was built for this. Everything either does way too much or only monitors the local machine. So I wrote ZPulse. It is purpose-built for home racks: lightweight agents that stream disk and ZFS telemetry over a single WebSocket connection to one central dashboard. No metric pipelines, no config files longer than the code itself, no containers, no databases. Just a Python agent on each node and a dashboard on whatever box you have lying around.
## Preview
![](./.screens/preview.png)
![](./.screens/preview2.png)
## Architecture
```
[Server 1] agent.py ──WebSocket──┐
[Server 2] agent.py ──WebSocket──┼──> [Central Box] dashboard.py <──WS──> [Browser]
[Server N] agent.py ──WebSocket──┘
```
- `agent.py` runs on each server as root, collects disk/ZFS/SMART/I/O data, & streams it to the dashboard
- `dashboard.py` runs on a central machine *(Raspberry Pi, NUC, whatever)*, aggregates data from all agents, & serves the web UI
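- All data flows over persistent WebSocket connections, so the dashboard never polls: agents push I/O rates & temperatures every 3s, pool/dataset state every 30s, & SMART data every 300s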
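
As a rough illustration of what travels over those connections, the sketch below builds the `hello` frame and an `io` frame with the same keys `agent.py` uses. The helper names (`make_hello`, `make_io`, `message_type`) are hypothetical and exist only for this example; they are not functions in the repo.

```python
import json
import time


def make_hello(hostname: str, capabilities: dict) -> str:
    """Build the first frame an agent sends after connecting."""
    return json.dumps({'type': 'hello', 'hostname': hostname, 'capabilities': capabilities})


def make_io(rates: dict, pool_map: dict, temps: dict) -> str:
    """Build the periodic I/O frame (rates, disk-to-pool map, temperatures)."""
    return json.dumps({'type': 'io', 'ts': time.time(), 'rates': rates, 'pool_map': pool_map, 'temps': temps})


def message_type(raw: str):
    """Return the 'type' field of a frame, or None if it is not a JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return data.get('type') if isinstance(data, dict) else None


if __name__ == '__main__':
    hello = make_hello('nas1', {'smartctl': True, 'zfs': True, 'hostname': 'nas1'})
    io = make_io({'sda': {'read_bps': 1024.0}}, {'sda': 'tank'}, {'sda': 34})
    print(message_type(hello), message_type(io))
```

Every frame carries a `type` key, so a receiver can route on that field alone and ignore anything it does not recognise.
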
## Dashboard Setup
Installs to `/opt/zpulse-dashboard`. No root required at runtime, just for the setup itself.
```bash
sudo ./dashboard/setup.sh
```
This installs dependencies, creates a venv, fetches Chart.js, and sets up a systemd service. The dashboard listens on port 8888 by default.
To run manually instead:
```bash
cd dashboard
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python3 dashboard.py
```
| Flag | Default | Description |
|-----------------|-----------|----------------|
| `-h`, `--host` | `0.0.0.0` | Listen address |
| `-p`, `--port` | `8888` | Listen port |
| `-d`, `--debug` | off | Debug logging |
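
A minimal sketch of how these flags could be declared, assuming the dashboard uses `argparse` the way `agent.py` does (the dashboard's own parser is not shown here, so `build_parser` is a hypothetical name). One detail worth knowing: `argparse` reserves `-h` for help by default, so a parser that offers `-h`/`--host` has to disable the built-in help flag first.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # add_help=False frees up -h, which argparse otherwise claims for --help
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument('--help', action='help', help='Show this help message and exit')
    parser.add_argument('-h', '--host', default='0.0.0.0', help='Listen address')
    parser.add_argument('-p', '--port', type=int, default=8888, help='Listen port')
    parser.add_argument('-d', '--debug', action='store_true', help='Debug logging')
    return parser


if __name__ == '__main__':
    args = build_parser().parse_args(['-h', '127.0.0.1', '-p', '9000'])
    print(args.host, args.port, args.debug)
```
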
## Agent Setup
Installs to `/opt/zpulse-agent`. Must run as root for SMART data & ZFS access.
```bash
sudo ./agent/setup.sh ws://DASHBOARD_IP:8888/ws/agent
```
This installs `smartmontools` and `zfsutils-linux`, creates a venv, and sets up a systemd service that auto-starts and reconnects.
To run manually instead:
```bash
cd agent
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
sudo ./venv/bin/python agent.py ws://DASHBOARD_IP:8888/ws/agent
```
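
The agent keeps retrying on its own when the dashboard is unreachable. The sketch below shows the general shape of that retry loop with a fixed 5 second delay, as in `ws_main`. It takes any coroutine factory instead of a real WebSocket so it runs without a dashboard; `run_with_reconnect` and its `max_attempts` parameter are hypothetical and only there to make the example finite.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def run_with_reconnect(session: Callable[[], Awaitable[None]],
                             delay: float = 5.0,
                             max_attempts: Optional[int] = None) -> int:
    """Run session() over and over, sleeping `delay` seconds between attempts."""
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            await session()
        except Exception as e:
            # Any failure (refused connection, dropped link) just triggers a retry
            print(f'link lost ({type(e).__name__}: {e}), reconnecting in {delay}s...')
        await asyncio.sleep(delay)
    return attempts


if __name__ == '__main__':
    async def always_refused():
        raise ConnectionRefusedError('dashboard is down')

    asyncio.run(run_with_reconnect(always_refused, delay=0.1, max_attempts=2))
```
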
## Gotify Notifications
Open the dashboard in a browser and click Settings. Enter your Gotify server URL and app token, hit Test, then Save. Alert thresholds for temperature, space usage, SMART failures, and pool health are all configured from the same panel.
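
For orientation, here is a rough sketch of the two pieces behind a notification: a per-target cooldown gate and the Gotify `POST /message` request. It mirrors the logic in `dashboard.py` but uses only the standard library and never sends anything; `Cooldown` and `build_gotify_request` are hypothetical names, and `gotify.local` is a placeholder host.

```python
import json
import urllib.parse
import urllib.request


class Cooldown:
    """Suppress repeat alerts for the same type:target pair within `seconds`."""

    def __init__(self, seconds: float = 3600):
        self.seconds = seconds
        self.last = {}  # 'alert_type:target' -> timestamp of the last alert

    def should_alert(self, alert_type: str, target: str, now: float) -> bool:
        key = f'{alert_type}:{target}'
        if now - self.last.get(key, 0) < self.seconds:
            return False
        self.last[key] = now
        return True


def build_gotify_request(base_url: str, token: str, title: str, message: str, priority: int = 5):
    """Build (but do not send) the POST request for Gotify's /message endpoint."""
    query = urllib.parse.urlencode({'token': token})
    body = json.dumps({'title': title, 'message': message, 'priority': priority}).encode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/message?{query}",
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


if __name__ == '__main__':
    gate = Cooldown(seconds=3600)
    if gate.should_alert('disk_temp_crit', 'sda', now=10_000):
        req = build_gotify_request('http://gotify.local/', 'abc', '[CRITICAL] nas1/sda', 'sda is at 56C', 8)
        print(req.get_method(), req.full_url)
```

Keying the cooldown on both alert type and target means a hot `sda` does not silence a later alert about `sdb` or about a degraded pool.
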
## What It Monitors
- Fleet overview with all connected servers, health status, storage usage, alert counts
- Per-server detail view:
  - System info *(kernel, ZFS version, uptime, RAM, CPU)*
  - ZFS pools *(size, allocated, free, fragmentation, dedup, scrub age, vdev tree, errors)*
  - ZFS datasets *(used, available, referenced, compression ratio, quotas)*
  - Snapshots *(name, used, referenced, creation time)*
  - Live I/O charts *(per-disk throughput, IOPS, read/write latency, busy%)*
  - ARC cache stats *(size, hit rate, MRU/MFU, L2ARC)*
  - Temperature charts *(per-disk, live)*
  - Disk details *(model, serial, firmware, capacity, RPM, protocol, health score 0-100, full SMART attributes, SAS error counters, grown defects)*
- SMART self-test triggering from the UI
- Alerts pushed to Gotify with configurable cooldowns
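
To give a feel for the 0-100 health score mentioned under disk details, the sketch below is a trimmed-down version of the scoring in `agent.py`. It keeps only three entries from `HEALTH_PENALTIES` plus the temperature and SMART-status rules; the function name `health_score` and its simplified arguments are assumptions made for this example. The real function also weighs power-on hours and grown defects.

```python
# Subset of the agent's penalty table: SMART attribute id -> (max penalty, points per raw unit)
HEALTH_PENALTIES = {
    5:   (30, 3),  # Reallocated Sectors
    197: (30, 5),  # Current Pending Sector Count
    199: (10, 1),  # UDMA CRC Error Count
}


def health_score(raw_attrs: dict, temperature=None, smart_passed=True) -> int:
    """Score a disk from SMART raw values (id -> raw), temperature in C, and overall status."""
    score = 100.0
    for attr_id, raw in raw_attrs.items():
        penalty = HEALTH_PENALTIES.get(attr_id)
        if penalty and raw > 0:
            cap, per_unit = penalty
            score -= min(cap, raw * per_unit)  # each attribute's penalty is capped
    if temperature and temperature > 50:
        score -= (temperature - 50) * 2        # 2 points per degree above 50C
    if smart_passed is False:
        score = min(score, 10)                 # a failed SMART status caps the score at 10
    return max(0, min(100, int(score)))


if __name__ == '__main__':
    # 4 reallocated sectors (4 * 3 = 12) and 53C (3 * 2 = 6): 100 - 12 - 6 = 82
    print(health_score({5: 4}, temperature=53))
```

Capping each attribute keeps a single runaway counter from dragging the score to zero on its own, while a failed overall SMART status still overrides everything else.
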

801
agent/agent.py Normal file

@@ -0,0 +1,801 @@
#!/usr/bin/env python3
# ZPulse - Developed by acidvegas in Python (https://github.com/acidvegas/rackwatch)
# zpulse/agent/agent.py
import argparse
import asyncio
import json
import logging
import os
import re
import shutil
import socket
import subprocess
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime
from pathlib import Path
try:
    import apv
except ImportError:
    raise ImportError('missing apv module (pip install apv)')
try:
    import websockets
except ImportError:
    raise ImportError('missing websockets module (pip install websockets)')
# ── Configuration ────────────────────────────────────────────────────────────
SMART_INTERVAL = 300
POOL_INTERVAL = 30
IO_INTERVAL = 3
HEALTH_PENALTIES = {
5: (30, 3), # Reallocated Sectors
10: (15, 5), # Spin Retry Count
187: (20, 2), # Reported Uncorrectable Errors
188: (10, 1), # Command Timeout
196: (10, 2), # Reallocation Event Count
197: (30, 5), # Current Pending Sector Count
198: (30, 5), # Offline Uncorrectable
199: (10, 1), # UDMA CRC Error Count
}
# ── Global State ─────────────────────────────────────────────────────────────
lock = threading.Lock()
init_done = threading.Event()
capabilities = {'smartctl': False, 'zfs': False, 'hostname': socket.gethostname()}
cache = {
'disks' : [],
'pools' : [],
'datasets' : [],
'snapshots' : [],
'io_rates' : {},
'arc' : None,
'pool_map' : {},
'temps' : {},
'system_info' : {},
}
_prev_diskstats = {}
_prev_arcstats = {}
# ── Helpers ──────────────────────────────────────────────────────────────────
def run_cmd(cmd: list[str], timeout: int = 30):
    '''
    Run a command (without a shell) and return stdout, stderr, and return code.
    Failures are reported through negative return codes: -1 timeout, -2 binary not found, -3 any other error.
    :param cmd: Command and arguments to execute
    :param timeout: Maximum seconds to wait before killing the process
    '''
    try:
        r = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return r.stdout, r.stderr, r.returncode
    except subprocess.TimeoutExpired:
        return '', 'timeout', -1
    except FileNotFoundError:
        return '', f'{cmd[0]} not found', -2
    except Exception as e:
        return '', str(e), -3
def detect_capabilities():
'''Check system for smartctl and zpool availability.'''
capabilities['smartctl'] = shutil.which('smartctl') is not None
capabilities['zfs'] = shutil.which('zpool') is not None
# ── System Info ──────────────────────────────────────────────────────────────
def collect_system_info():
'''Collect hostname, kernel, ZFS version, uptime, RAM, and CPU info from /proc and /sys.'''
info = {
'hostname' : capabilities['hostname'],
'kernel' : '',
'zfs_version' : '',
'uptime_seconds' : 0,
'ram_total' : 0,
'ram_available' : 0,
'cpu_model' : '',
'cpu_count' : os.cpu_count() or 0,
}
out, _, rc = run_cmd(['uname', '-r'])
if rc == 0:
info['kernel'] = out.strip()
try:
with open('/sys/module/zfs/version') as f:
info['zfs_version'] = f.read().strip()
except FileNotFoundError:
out, _, rc = run_cmd(['modinfo', '-F', 'version', 'zfs'])
if rc == 0:
info['zfs_version'] = out.strip()
except Exception:
pass
try:
with open('/proc/uptime') as f:
info['uptime_seconds'] = float(f.read().split()[0])
except Exception:
pass
try:
with open('/proc/meminfo') as f:
for line in f:
if line.startswith('MemTotal:'):
info['ram_total'] = int(line.split()[1]) * 1024
elif line.startswith('MemAvailable:'):
info['ram_available'] = int(line.split()[1]) * 1024
except Exception:
pass
try:
with open('/proc/cpuinfo') as f:
for line in f:
if line.startswith('model name'):
info['cpu_model'] = line.split(':', 1)[1].strip()
break
except Exception:
pass
return info
# ── Health Score ─────────────────────────────────────────────────────────────
def compute_health_score(disk: dict):
    '''
    Compute a 0-100 health score based on SMART attributes, temperature, power-on hours, and defects.
    :param disk: Disk info dict with smart_attributes, temperature, power_on_hours, etc.
    '''
    score = 100.0
    for attr in disk.get('smart_attributes', []):
        raw = attr.get('raw', 0)
        if not isinstance(raw, (int, float)):
            try:
                raw = int(str(raw).replace(',', ''))
            except (ValueError, TypeError):
                raw = 0
        penalty = HEALTH_PENALTIES.get(attr.get('id', 0))
        if penalty and raw > 0:
            score -= min(penalty[0], raw * penalty[1])
        if attr.get('when_failed'):
            score -= 20
    temp = disk.get('temperature')
    if temp and temp > 50:
        score -= (temp - 50) * 2
    poh = disk.get('power_on_hours')
    if poh and poh > 40000:
        score -= min(15, (poh - 40000) // 5000)
    gdc = disk.get('grown_defect_count')
    if isinstance(gdc, (int, float)) and gdc > 0:
        score -= min(30, int(gdc) * 3)
    if disk.get('health') is False:
        score = min(score, 10)
    return max(0, min(100, int(score)))
# ── Fast Temperature ─────────────────────────────────────────────────────────
def collect_temps_fast():
'''Read disk temperatures from /sys/block hwmon entries without shelling out.'''
temps = {}
try:
for name in os.listdir('/sys/block'):
if not re.match(r'^(sd[a-z]+|nvme\d+n\d+)$', name):
continue
for base in [Path(f'/sys/block/{name}/device/hwmon'), Path(f'/sys/block/{name}/device')]:
if not base.exists():
continue
found = False
try:
entries = sorted(base.iterdir())
except OSError:
continue
for item in entries:
if not item.name.startswith('hwmon'):
continue
tf = item / 'temp1_input'
if tf.exists():
try:
with open(tf) as f:
temps[name] = int(f.read().strip()) // 1000
found = True
except (ValueError, OSError):
pass
break
if found:
break
except OSError:
pass
return temps
# ── Disk & SMART Collection ─────────────────────────────────────────────────
def collect_smart(device: str):
'''
Run smartctl -j -a on a device and parse the JSON output.
:param device: Block device path (e.g. /dev/sda)
'''
out, _, _ = run_cmd(['smartctl', '-j', '-a', device], timeout=30)
try:
data = json.loads(out)
except (json.JSONDecodeError, ValueError):
return {'smart_available': False}
info = {
'smart_available' : data.get('smart_support', {}).get('available', False),
'smart_enabled' : data.get('smart_support', {}).get('enabled', False),
'health' : data.get('smart_status', {}).get('passed'),
'temperature' : data.get('temperature', {}).get('current'),
'power_on_hours' : data.get('power_on_time', {}).get('hours'),
'model_family' : data.get('model_family', ''),
'firmware' : data.get('firmware_version', ''),
'device_model' : data.get('model_name', ''),
'user_capacity' : data.get('user_capacity', {}).get('bytes', 0),
'rotation_rate' : data.get('rotation_rate', 0),
'form_factor' : data.get('form_factor', {}).get('name', ''),
'protocol' : data.get('device', {}).get('protocol', 'Unknown'),
'sata_version' : data.get('sata_version', {}).get('string', ''),
'smart_attributes' : [],
'sas_error_counters' : None,
'grown_defect_count' : None,
}
if 'ata_smart_attributes' in data:
for attr in data['ata_smart_attributes'].get('table', []):
info['smart_attributes'].append({
'id' : attr.get('id', 0),
'name' : attr.get('name', ''),
'value' : attr.get('value', 0),
'worst' : attr.get('worst', 0),
'threshold' : attr.get('thresh', 0),
'raw' : attr.get('raw', {}).get('value', 0),
'flags' : attr.get('flags', {}).get('string', ''),
'when_failed': attr.get('when_failed', ''),
})
if 'scsi_error_counter_log' in data:
info['sas_error_counters'] = data['scsi_error_counter_log']
if 'scsi_grown_defect_list' in data:
info['grown_defect_count'] = data['scsi_grown_defect_list']
return info
def collect_disks():
'''Enumerate physical disks via lsblk and collect SMART data in parallel.'''
out, _, rc = run_cmd(['lsblk', '-d', '-b', '-o', 'NAME,SIZE,MODEL,SERIAL,ROTA,TRAN,TYPE', '-J'])
if rc != 0:
return []
try:
data = json.loads(out)
except json.JSONDecodeError:
return []
pool_map = collect_pool_mapping()
devs = []
for dev in data.get('blockdevices', []):
if dev.get('type') != 'disk':
continue
name = dev['name']
devs.append({
'name' : name,
'path' : f'/dev/{name}',
'size' : int(dev.get('size') or 0),
'model' : (dev.get('model') or '').strip() or 'Unknown',
'serial' : (dev.get('serial') or '').strip() or 'Unknown',
'rotational' : bool(dev.get('rota')),
'transport' : dev.get('tran') or 'unknown',
'pool' : pool_map.get(name, ''),
})
if capabilities['smartctl'] and devs:
with ThreadPoolExecutor(max_workers=min(8, len(devs))) as executor:
futures = {executor.submit(collect_smart, d['path']): d for d in devs}
for future in as_completed(futures):
disk = futures[future]
try:
disk.update(future.result(timeout=45))
except Exception as e:
logging.warning('SMART failed for %s: %s', disk['name'], e)
for d in devs:
d['health_score'] = compute_health_score(d)
with lock:
cache['pool_map'] = pool_map
return devs
# ── ZFS Collection ───────────────────────────────────────────────────────────
def collect_pool_mapping():
'''Parse zpool status to build a device name to pool name mapping.'''
if not capabilities['zfs']:
return {}
mapping = {}
out, _, rc = run_cmd(['zpool', 'status', '-L'])
if rc != 0:
out, _, rc = run_cmd(['zpool', 'status'])
if rc != 0:
return {}
current_pool = None
for line in out.splitlines():
m = re.match(r'\s*pool:\s+(\S+)', line)
if m:
current_pool = m.group(1)
continue
if current_pool:
m2 = re.match(r'\s+(/dev/)?(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)', line)
if m2:
dev = m2.group(2)
if dev == current_pool or dev.endswith(':'):
continue
if re.match(r'^(mirror|raidz[123]?|spare|log|cache|special|replacing)(-\d+)?$', dev):
continue
dev = os.path.basename(os.path.realpath(f'/dev/{dev}')) if os.path.exists(f'/dev/{dev}') else dev
dev = re.sub(r'-part\d+$', '', dev)
dev = re.sub(r'p\d+$', '', dev) if re.match(r'^nvme\d+n\d+p\d+$', dev) else dev
dev = re.sub(r'\d+$', '', dev) if re.match(r'^sd[a-z]+\d+$', dev) else dev
mapping[dev] = current_pool
return mapping
def parse_scrub_age(scan_text: str):
'''
Extract the number of days since the last scrub from zpool scan text.
:param scan_text: The scan line from zpool status output
'''
if not scan_text or 'scrub' not in scan_text.lower():
return None
m = re.search(r'on\s+\w+\s+(\w+\s+\d+\s+[\d:]+\s+\d{4})', scan_text)
if not m:
return None
try:
return (datetime.now() - datetime.strptime(m.group(1), '%b %d %H:%M:%S %Y')).days
except (ValueError, TypeError):
return None
def collect_pools():
'''Collect ZFS pool stats, vdev tree, scrub info, and error summary.'''
if not capabilities['zfs']:
return []
out, _, rc = run_cmd(['zpool', 'list', '-Hp', '-o', 'name,size,alloc,free,frag,cap,dedup,health,ashift'])
if rc != 0:
return []
pools = []
for line in out.strip().splitlines():
if not line.strip():
continue
p = line.split('\t')
if len(p) < 8:
continue
pool = {
'name' : p[0],
'size' : int(p[1]),
'allocated' : int(p[2]),
'free' : int(p[3]),
'fragmentation' : int(p[4]) if p[4] != '-' else 0,
'capacity_pct' : int(p[5]) if p[5] != '-' else 0,
'dedup' : float(p[6].rstrip('x')) if p[6] != '-' else 1.0,
'health' : p[7],
'ashift' : int(p[8]) if len(p) > 8 and p[8] != '-' else 0,
'scan' : '',
'vdevs' : [],
'errors_summary': '',
'scrub_age_days': None,
}
s_out, _, _ = run_cmd(['zpool', 'status', '-L', pool['name']])
if s_out:
pool['scan'], pool['vdevs'], pool['errors_summary'] = parse_pool_status(s_out)
pool['scrub_age_days'] = parse_scrub_age(pool['scan'])
pools.append(pool)
return pools
def parse_pool_status(text: str):
'''
Parse raw zpool status output into scan text, vdev list, and error summary.
:param text: Raw output from zpool status
'''
scan_lines = []
vdevs = []
errors_summary = ''
in_scan = False
in_config = False
for line in text.splitlines():
if line.strip().startswith('scan:'):
in_scan = True
scan_lines.append(line.split('scan:', 1)[1].strip())
continue
if in_scan:
if line.startswith('\t') and not line.strip().startswith('NAME'):
scan_lines.append(line.strip())
else:
in_scan = False
if line.strip().startswith('NAME') and 'STATE' in line:
in_config = True
continue
if in_config:
if not line.strip() or line.strip().startswith('errors:'):
in_config = False
if line.strip().startswith('errors:'):
errors_summary = line.split('errors:', 1)[1].strip()
continue
parts = line.split()
if len(parts) >= 2:
vdevs.append({
'name' : parts[0],
'state' : parts[1] if len(parts) > 1 else '',
'read' : parts[2] if len(parts) > 2 else '0',
'write' : parts[3] if len(parts) > 3 else '0',
'cksum' : parts[4] if len(parts) > 4 else '0',
'indent': len(line) - len(line.lstrip('\t')),
})
return ' '.join(scan_lines), vdevs, errors_summary
def collect_datasets_and_snapshots():
'''Collect all ZFS datasets and snapshots in a single zfs list call.'''
if not capabilities['zfs']:
return [], []
out, _, rc = run_cmd(['zfs', 'list', '-t', 'all', '-Hp', '-o', 'name,used,avail,refer,mountpoint,compression,compressratio,recordsize,type,quota,reservation,creation', '-s', 'creation'])
if rc != 0:
return [], []
datasets = []
snapshots = []
for line in out.strip().splitlines():
if not line.strip():
continue
p = line.split('\t')
if len(p) < 9:
continue
if p[8] == 'snapshot':
try:
creation = int(p[11]) if len(p) > 11 else 0
except (ValueError, TypeError):
creation = p[11] if len(p) > 11 else 0
snapshots.append({
'name' : p[0],
'used' : int(p[1]) if p[1] != '-' else 0,
'referenced' : int(p[3]) if p[3] != '-' else 0,
'creation' : creation,
})
else:
datasets.append({
'name' : p[0],
'used' : int(p[1]) if p[1] != '-' else 0,
'available' : int(p[2]) if p[2] != '-' else 0,
'referenced' : int(p[3]) if p[3] != '-' else 0,
'mountpoint' : p[4] if p[4] != '-' else '',
'compression' : p[5] if p[5] != '-' else 'off',
'compressratio' : p[6] if p[6] != '-' else '1.00x',
'recordsize' : int(p[7]) if p[7] not in ('-', '') else 0,
'type' : p[8],
'quota' : int(p[9]) if len(p) > 9 and p[9] not in ('-', '0', 'none', '') else 0,
'reservation' : int(p[10]) if len(p) > 10 and p[10] not in ('-', '0', 'none', '') else 0,
})
return datasets, snapshots
# ── I/O & ARC Collection ────────────────────────────────────────────────────
def collect_iostat():
    '''Read /proc/diskstats and compute per-disk I/O rates from deltas.'''
    global _prev_diskstats
    current = {}
    now = time.time()
    try:
        with open('/proc/diskstats') as f:
            for line in f:
                parts = line.split()
                if len(parts) < 14:
                    continue
                name = parts[2]
                if not re.match(r'^(sd[a-z]+|nvme\d+n\d+|dm-\d+|vd[a-z]+|xvd[a-z]+)$', name):
                    continue
                current[name] = {
                    'read_ios' : int(parts[3]),
                    'read_sectors' : int(parts[5]),
                    'read_ticks' : int(parts[6]),
                    'write_ios' : int(parts[7]),
                    'write_sectors' : int(parts[9]),
                    'write_ticks' : int(parts[10]),
                    'io_ticks' : int(parts[12]) if len(parts) > 12 else 0,
                    'ts' : now,
                }
    except FileNotFoundError:
        return {}
    rates = {}
    if _prev_diskstats:
        for name, cur in current.items():
            prev = _prev_diskstats.get(name)
            if not prev:
                continue
            dt = cur['ts'] - prev['ts']
            if dt <= 0:
                continue
            d_rio = cur['read_ios'] - prev['read_ios']
            d_wio = cur['write_ios'] - prev['write_ios']
            # Sector counts in /proc/diskstats are always in 512-byte units; tick fields are milliseconds
            rates[name] = {
                'read_bps' : (cur['read_sectors'] - prev['read_sectors']) * 512 / dt,
                'write_bps' : (cur['write_sectors'] - prev['write_sectors']) * 512 / dt,
                'read_iops' : d_rio / dt,
                'write_iops' : d_wio / dt,
                'read_lat_ms' : (cur['read_ticks'] - prev['read_ticks']) / d_rio if d_rio > 0 else 0,
                'write_lat_ms' : (cur['write_ticks'] - prev['write_ticks']) / d_wio if d_wio > 0 else 0,
                'busy_pct' : min(100, (cur['io_ticks'] - prev['io_ticks']) / (dt * 10)),
            }
    _prev_diskstats = current
    return rates
def collect_arc_stats():
'''Read /proc/spl/kstat/zfs/arcstats and compute ARC hit rates.'''
global _prev_arcstats
raw = {}
try:
with open('/proc/spl/kstat/zfs/arcstats') as f:
for line in f:
parts = line.split()
if len(parts) >= 3:
try:
raw[parts[0]] = int(parts[2])
except ValueError:
pass
except FileNotFoundError:
return None
if not raw:
return None
hits = raw.get('hits', 0)
misses = raw.get('misses', 0)
total = hits + misses
lifetime_rate = round(hits / total * 100, 2) if total > 0 else 0
if _prev_arcstats:
dh = hits - _prev_arcstats.get('hits', 0)
dm = misses - _prev_arcstats.get('misses', 0)
dt = dh + dm
hit_rate = round(dh / dt * 100, 2) if dt > 0 else lifetime_rate
else:
hit_rate = lifetime_rate
_prev_arcstats = {'hits': hits, 'misses': misses}
return {
'size' : raw.get('size', 0),
'max_size' : raw.get('c_max', 0),
'min_size' : raw.get('c_min', 0),
'target_size' : raw.get('c', 0),
'hits' : hits,
'misses' : misses,
'hit_rate' : hit_rate,
'lifetime_hit_rate' : lifetime_rate,
'mru_size' : raw.get('mru_size', 0),
'mfu_size' : raw.get('mfu_size', 0),
'anon_size' : raw.get('anon_size', 0),
'metadata_size' : raw.get('arc_meta_used', 0),
'demand_hits' : raw.get('demand_data_hits', 0) + raw.get('demand_metadata_hits', 0),
'prefetch_hits' : raw.get('prefetch_data_hits', 0) + raw.get('prefetch_metadata_hits', 0),
'l2_hits' : raw.get('l2_hits', 0),
'l2_misses' : raw.get('l2_misses', 0),
'l2_size' : raw.get('l2_size', 0),
'l2_asize' : raw.get('l2_asize', 0),
}
# ── Background Worker ────────────────────────────────────────────────────────
def background_worker():
'''Collect all monitoring data on timed intervals and update the shared cache.'''
tick = 0
collect_iostat()
time.sleep(1)
while True:
try:
io_rates = collect_iostat()
arc = collect_arc_stats()
fast_temps = collect_temps_fast()
with lock:
for d in cache.get('disks', []):
name = d.get('name', '')
if name not in fast_temps and d.get('temperature') is not None:
fast_temps[name] = d['temperature']
cache['io_rates'] = io_rates
cache['arc'] = arc
cache['temps'] = fast_temps
if tick % (POOL_INTERVAL // IO_INTERVAL) == 0:
pools = collect_pools()
datasets, snapshots = collect_datasets_and_snapshots()
sys_info = collect_system_info()
with lock:
cache['pools'] = pools
cache['datasets'] = datasets
cache['snapshots'] = snapshots
cache['system_info'] = sys_info
if tick % (SMART_INTERVAL // IO_INTERVAL) == 0:
disks = collect_disks()
with lock:
cache['disks'] = disks
if not init_done.is_set():
init_done.set()
tick += 1
except Exception:
logging.exception('Worker error')
time.sleep(IO_INTERVAL)
# ── WebSocket Client ─────────────────────────────────────────────────────────
async def ws_sender(ws):
'''
Stream cache data to the dashboard over WebSocket.
:param ws: Active WebSocket connection to the dashboard
'''
with lock:
io_msg = json.dumps({'type': 'io', 'ts': time.time(), 'rates': cache['io_rates'], 'pool_map': cache['pool_map'], 'temps': cache['temps']})
arc_msg = json.dumps({'type': 'arc', 'ts': time.time(), 'arc': cache['arc']}) if cache['arc'] else None
pools_msg = json.dumps({'type': 'pools', 'ts': time.time(), 'pools': cache['pools']})
datasets_msg = json.dumps({'type': 'datasets', 'ts': time.time(), 'datasets': cache['datasets']})
snaps_msg = json.dumps({'type': 'snapshots', 'ts': time.time(), 'snapshots': cache['snapshots']})
disks_msg = json.dumps({'type': 'disks', 'ts': time.time(), 'disks': cache['disks']})
system_msg = json.dumps({'type': 'system', 'ts': time.time(), 'info': cache['system_info']})
await ws.send(system_msg)
await ws.send(pools_msg)
await ws.send(datasets_msg)
await ws.send(snaps_msg)
await ws.send(disks_msg)
await ws.send(io_msg)
if arc_msg:
await ws.send(arc_msg)
tick = 0
while True:
await asyncio.sleep(IO_INTERVAL)
tick += 1
with lock:
io_msg = json.dumps({'type': 'io', 'ts': time.time(), 'rates': cache['io_rates'], 'pool_map': cache['pool_map'], 'temps': cache['temps']})
await ws.send(io_msg)
with lock:
arc = cache['arc']
if arc:
await ws.send(json.dumps({'type': 'arc', 'ts': time.time(), 'arc': arc}))
if tick % (POOL_INTERVAL // IO_INTERVAL) == 0:
with lock:
pools_msg = json.dumps({'type': 'pools', 'ts': time.time(), 'pools': cache['pools']})
datasets_msg = json.dumps({'type': 'datasets', 'ts': time.time(), 'datasets': cache['datasets']})
snaps_msg = json.dumps({'type': 'snapshots', 'ts': time.time(), 'snapshots': cache['snapshots']})
system_msg = json.dumps({'type': 'system', 'ts': time.time(), 'info': cache['system_info']})
await ws.send(pools_msg)
await ws.send(datasets_msg)
await ws.send(snaps_msg)
await ws.send(system_msg)
if tick % (SMART_INTERVAL // IO_INTERVAL) == 0:
with lock:
disks_msg = json.dumps({'type': 'disks', 'ts': time.time(), 'disks': cache['disks']})
await ws.send(disks_msg)
async def ws_receiver(ws):
'''
Receive and execute commands from the dashboard.
:param ws: Active WebSocket connection to the dashboard
'''
async for raw in ws:
try:
data = json.loads(raw)
except json.JSONDecodeError:
continue
cmd = data.get('type')
if cmd == 'smarttest':
device = data.get('device', '')
test_type = data.get('test_type', 'short')
if test_type not in ('short', 'long', 'conveyance'):
continue
if not re.match(r'^/dev/(sd[a-z]+|nvme\d+n\d+|da\d+)$', device):
continue
out, err, rc = await asyncio.to_thread(run_cmd, ['smartctl', '-t', test_type, device])
await ws.send(json.dumps({
'type': 'smarttest_result', 'device': device,
'test_type': test_type, 'success': rc == 0,
'output': out.strip(),
}))
async def ws_main(dashboard_url: str):
    '''
    Connect to the dashboard and maintain the WebSocket link with auto-reconnect.
    :param dashboard_url: WebSocket URL of the dashboard (e.g. ws://10.0.0.50:8888/ws/agent)
    '''
    while True:
        try:
            async with websockets.connect(dashboard_url, ping_interval=20, ping_timeout=10, max_size=2**22, close_timeout=5) as ws:
                logging.info('Connected to dashboard at %s', dashboard_url)
                await ws.send(json.dumps({
                    'type': 'hello',
                    'hostname': capabilities['hostname'],
                    'capabilities': capabilities,
                }))
                await asyncio.gather(ws_sender(ws), ws_receiver(ws))
        except Exception as e:
            logging.warning('WebSocket (%s: %s), reconnecting in 5s...', type(e).__name__, e)
        await asyncio.sleep(5)
if __name__ == '__main__':
# Parse command line arguments
parser = argparse.ArgumentParser()
parser.add_argument('dashboard_url', help='Dashboard WebSocket URL (e.g. ws://10.0.0.50:8888/ws/agent)')
parser.add_argument('-d', '--debug', action='store_true', help='Enable debug logging')
args = parser.parse_args()
# Setup logging
if args.debug:
apv.setup_logging(level='DEBUG', log_to_disk=True, max_log_size=5*1024*1024, max_backups=5, compress_backups=True, log_file_name='havoc', show_details=True)
logging.debug('Debug logging enabled')
else:
apv.setup_logging(level='INFO')
detect_capabilities()
if os.geteuid() != 0:
raise RuntimeError('This program must be run as root')
logging.info('ZPulse Agent starting — host: %s', capabilities['hostname'])
logging.info(' smartctl: %s', 'available' if capabilities['smartctl'] else 'NOT FOUND')
logging.info(' zfs: %s', 'available' if capabilities['zfs'] else 'NOT FOUND')
logging.info(' dashboard: %s', args.dashboard_url)
worker = threading.Thread(target=background_worker, daemon=True)
worker.start()
init_done.wait(timeout=120)
asyncio.run(ws_main(args.dashboard_url))

5
agent/requirements.txt Normal file

@@ -0,0 +1,5 @@
# ZPulse - Developed by acidvegas in Python (https://github.com/acidvegas/rackwatch)
# zpulse/agent/requirements.txt
apv
websockets

53
agent/setup.sh Executable file

@@ -0,0 +1,53 @@
#!/bin/bash
# ZPulse - Developed by acidvegas in Python (https://github.com/acidvegas/rackwatch)
# zpulse/agent/setup.sh
# Set trace, verbose, and exit on error
set -xev
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
INSTALL_DIR="/opt/zpulse-agent"
SERVICE_NAME="zpulse-agent"
# Check if running as root & an argument is provided
[ "$(id -u)" -ne 0 ] && { echo "Run as root: sudo $0 <dashboard_url>"; exit 1; }
[ -z "$1" ] && { echo "Usage: sudo $0 ws://DASHBOARD_IP:8888/ws/agent"; exit 1; }
# Set the dashboard URL
DASHBOARD_URL="$1"
# Install system packages
apt-get update -qq && apt-get install -y smartmontools zfsutils-linux python3-pip python3-venv
# Copy agent files to install directory
mkdir -p "$INSTALL_DIR"
cp "$SCRIPT_DIR/agent.py" "$INSTALL_DIR/"
cp "$SCRIPT_DIR/requirements.txt" "$INSTALL_DIR/"
# Create a Python virtual environment & install dependencies
python3 -m venv "$INSTALL_DIR/venv"
"$INSTALL_DIR/venv/bin/pip" install --quiet -r "$INSTALL_DIR/requirements.txt"
# Install the systemd service
cat > /etc/systemd/system/${SERVICE_NAME}.service <<EOF
[Unit]
Description=ZPulse Agent
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
ExecStart=$INSTALL_DIR/venv/bin/python $INSTALL_DIR/agent.py $DASHBOARD_URL
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
EOF
# Reload the systemd daemon & enable & start the service
systemctl daemon-reload && systemctl enable ${SERVICE_NAME} && systemctl start ${SERVICE_NAME}
echo "ZPulse Agent installed to $INSTALL_DIR and running!"

596
dashboard/dashboard.py Normal file

@@ -0,0 +1,596 @@
#!/usr/bin/env python3
# ZPulse - Developed by acidvegas in Python (https://github.com/acidvegas/rackwatch)
# zpulse/dashboard/dashboard.py
import argparse
import asyncio
import json
import logging
import os
import tempfile
import time
from collections import deque
from pathlib import Path
try:
    import aiohttp.web
except ImportError:
    raise ImportError('missing aiohttp module (pip install aiohttp)')
try:
    import apv
except ImportError:
    raise ImportError('missing apv module (pip install apv)')
# ── Configuration ────────────────────────────────────────────────────────────
BASE_DIR = Path(__file__).parent
SETTINGS_FILE = BASE_DIR / 'settings.json'
HISTORY_SIZE = 720
DEFAULT_SETTINGS = {
'gotify_url' : '',
'gotify_token' : '',
'alert_temp_warning' : 45,
'alert_temp_critical' : 55,
'alert_space_warning' : 80,
'alert_space_critical' : 90,
'alert_smart_enabled' : True,
'alert_pool_enabled' : True,
'alert_cooldown' : 3600,
}
CRITICAL_SMART_ATTRS = {5, 10, 187, 188, 196, 197, 198, 199}
settings = dict(DEFAULT_SETTINGS)
# ── Per-Agent State ──────────────────────────────────────────────────────────
class AgentState:
    __slots__ = ('hostname', 'ws', 'online', 'last_seen', 'capabilities', 'current', 'history', 'alerts_active', 'alert_log', 'alert_cooldowns')
    def __init__(self, hostname: str, ws):
        '''Initialize state for a newly connected agent.
        :param hostname: Agent's hostname
        :param ws: WebSocket connection to the agent
        '''
        self.hostname = hostname
        self.ws = ws
        self.online = True
        self.last_seen = time.time()
        self.capabilities = {}
        self.current = {}
        self.history = {
            'timestamps' : deque(maxlen=HISTORY_SIZE),
            'io' : {},
            'arc_size' : deque(maxlen=HISTORY_SIZE),
            'arc_hit_rate' : deque(maxlen=HISTORY_SIZE),
            'temps' : {},
        }
        self.alerts_active = []
        self.alert_log = deque(maxlen=200)
        self.alert_cooldowns = {}
agents = {} # hostname -> AgentState
browser_subs = {} # ws -> subscribed hostname (or None)
gotify_session = None
# ── Settings ─────────────────────────────────────────────────────────────────
def load_settings():
    '''Load settings from disk, merging with defaults for any missing keys.'''
    global settings
    if SETTINGS_FILE.exists():
        try:
            with open(SETTINGS_FILE) as f:
                saved = json.load(f)
            merged = dict(DEFAULT_SETTINGS)
            merged.update(saved)
            settings = merged
        except Exception as e:
            logging.warning('Failed to load settings: %s', e)
def save_settings():
    '''Atomically write current settings to disk using a temp file and rename.'''
    tmp = None
    try:
        fd, tmp = tempfile.mkstemp(dir=str(BASE_DIR), suffix='.json.tmp')
        with os.fdopen(fd, 'w') as f:
            json.dump(settings, f, indent=2)
        os.replace(tmp, str(SETTINGS_FILE))
    except Exception as e:
        logging.error('Failed to save settings: %s', e)
        # Do not leave a stray temp file behind if the write or rename failed
        try:
            if tmp and os.path.exists(tmp):
                os.unlink(tmp)
        except OSError:
            pass
# ── Gotify ───────────────────────────────────────────────────────────────────
async def send_gotify(title: str, message: str, priority: int = 5):
    '''Send a push notification via the configured Gotify server.
    :param title: Notification title
    :param message: Notification body
    :param priority: Gotify priority level (default 5)
    '''
    global gotify_session
    url = settings.get('gotify_url', '').rstrip('/')
    token = settings.get('gotify_token', '')
    if not url or not token:
        return False
    try:
        if gotify_session is None:
            gotify_session = aiohttp.ClientSession()
        async with gotify_session.post(
            f'{url}/message?token={token}',
            json={'title': title, 'message': message, 'priority': priority},
            timeout=aiohttp.ClientTimeout(total=10),
        ) as resp:
            return resp.status == 200
    except Exception as e:
        logging.warning('Gotify failed: %s', e)
        return False
# ── Alerts ───────────────────────────────────────────────────────────────────
def _should_alert(agent: 'AgentState', alert_type: str, target: str):
    '''Check if an alert should fire based on cooldown period.

    :param agent: Agent state to check cooldowns against
    :param alert_type: Category of alert (e.g. disk_temp_crit, pool_health)
    :param target: Specific target name (e.g. sda, tank)
    '''
    key = f'{alert_type}:{target}'
    now = time.time()
    cooldown = settings.get('alert_cooldown', 3600)
    if now - agent.alert_cooldowns.get(key, 0) < cooldown:
        return False
    agent.alert_cooldowns[key] = now
    return True

_background_tasks = set()  # strong references; the event loop only keeps weak ones


async def _emit_alert(agent: 'AgentState', alert_type: str, severity: str, target: str, message: str):
    '''Log an alert and send a Gotify notification.

    :param agent: Agent state to append the alert to
    :param alert_type: Category of alert
    :param severity: One of info, warning, critical
    :param target: Specific target name (e.g. sda, tank)
    :param message: Human-readable alert message
    '''
    entry = {'type': alert_type, 'severity': severity, 'target': target,
             'message': message, 'timestamp': time.time()}
    agent.alert_log.appendleft(entry)
    priority = {'info': 3, 'warning': 5, 'critical': 8}.get(severity, 5)
    # Hold a reference until the notification finishes so the task cannot be garbage-collected mid-flight
    task = asyncio.create_task(send_gotify(f'[{severity.upper()}] {agent.hostname}/{target}', message, priority))
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)

async def check_alerts(agent: 'AgentState'):
    '''Evaluate disk and pool data against alert thresholds and emit notifications.

    :param agent: Agent state containing current disk and pool data
    '''
    active = []
    disks_msg = agent.current.get('disks', {})
    disks = disks_msg.get('disks', []) if isinstance(disks_msg, dict) else []
    pools_msg = agent.current.get('pools', {})
    pools = pools_msg.get('pools', []) if isinstance(pools_msg, dict) else []
    if settings.get('alert_smart_enabled', True):
        for d in disks:
            name = d.get('name', '')
            temp = d.get('temperature')
            if temp is not None:
                if temp >= settings.get('alert_temp_critical', 55):
                    a = {'type': 'disk_temp', 'severity': 'critical', 'target': name, 'message': f'Disk {name} temperature is {temp}°C (critical)'}
                    active.append(a)
                    if _should_alert(agent, 'disk_temp_crit', name):
                        await _emit_alert(agent, 'disk_temp', 'critical', name, a['message'])
                elif temp >= settings.get('alert_temp_warning', 45):
                    a = {'type': 'disk_temp', 'severity': 'warning', 'target': name, 'message': f'Disk {name} temperature is {temp}°C (warning)'}
                    active.append(a)
                    if _should_alert(agent, 'disk_temp_warn', name):
                        await _emit_alert(agent, 'disk_temp', 'warning', name, a['message'])
            if d.get('health') is False:
                a = {'type': 'disk_health', 'severity': 'critical', 'target': name, 'message': f'Disk {name} SMART health check FAILED'}
                active.append(a)
                if _should_alert(agent, 'disk_health', name):
                    await _emit_alert(agent, 'disk_health', 'critical', name, a['message'])
            for attr in d.get('smart_attributes', []):
                attr_id = attr.get('id')
                if attr_id in CRITICAL_SMART_ATTRS and attr.get('when_failed'):
                    # Fall back to the attribute ID if the agent did not send a name
                    attr_name = attr.get('name', attr_id)
                    a = {'type': 'smart_attr', 'severity': 'warning', 'target': name, 'message': f'Disk {name}: {attr_name} failing'}
                    active.append(a)
                    if _should_alert(agent, f'smart_attr_{attr_id}', name):
                        await _emit_alert(agent, 'smart_attr', 'warning', name, a['message'])
    if settings.get('alert_pool_enabled', True):
        for p in pools:
            pname = p.get('name', '')
            # Treat a missing or null health field as unknown instead of raising KeyError
            health = p.get('health') or ''
            if health not in ('ONLINE', ''):
                sev = 'critical' if health == 'FAULTED' else 'warning'
                a = {'type': 'pool_health', 'severity': sev, 'target': pname, 'message': f'Pool {pname} is {health}'}
                active.append(a)
                if _should_alert(agent, 'pool_health', pname):
                    await _emit_alert(agent, 'pool_health', sev, pname, a['message'])
            cap = p.get('capacity_pct') or 0
            if cap >= settings.get('alert_space_critical', 90):
                a = {'type': 'pool_space', 'severity': 'critical', 'target': pname, 'message': f'Pool {pname} is {cap}% full'}
                active.append(a)
                if _should_alert(agent, 'pool_space_crit', pname):
                    await _emit_alert(agent, 'pool_space', 'critical', pname, a['message'])
            elif cap >= settings.get('alert_space_warning', 80):
                a = {'type': 'pool_space', 'severity': 'warning', 'target': pname, 'message': f'Pool {pname} is {cap}% full'}
                active.append(a)
                if _should_alert(agent, 'pool_space_warn', pname):
                    await _emit_alert(agent, 'pool_space', 'warning', pname, a['message'])
    agent.alerts_active = active

# ── History ──────────────────────────────────────────────────────────────────
def update_history(agent: 'AgentState', data: dict):
    '''Append incoming I/O, temperature, and ARC data to the agent's rolling history.

    Only 'io' messages advance the history. ARC values are sampled from the most
    recent 'arc' message on each I/O tick, so 'arc' messages need no handling here.

    :param agent: Agent state containing history deques
    :param data: Incoming message dict from the agent
    '''
    msg_type = data.get('type')
    if msg_type == 'io':
        ts = data.get('ts', time.time())
        agent.history['timestamps'].append(ts)
        for dname, rates in data.get('rates', {}).items():
            if dname not in agent.history['io']:
                agent.history['io'][dname] = {
                    'read_bps'  : deque(maxlen=HISTORY_SIZE),
                    'write_bps' : deque(maxlen=HISTORY_SIZE),
                    'read_iops' : deque(maxlen=HISTORY_SIZE),
                    'write_iops': deque(maxlen=HISTORY_SIZE),
                }
            h = agent.history['io'][dname]
            h['read_bps'].append(rates.get('read_bps', 0))
            h['write_bps'].append(rates.get('write_bps', 0))
            h['read_iops'].append(rates.get('read_iops', 0))
            h['write_iops'].append(rates.get('write_iops', 0))
        for dname, temp in data.get('temps', {}).items():
            if dname not in agent.history['temps']:
                agent.history['temps'][dname] = deque(maxlen=HISTORY_SIZE)
            agent.history['temps'][dname].append(temp)
        arc_msg = agent.current.get('arc', {})
        arc = arc_msg.get('arc') if isinstance(arc_msg, dict) else None
        if arc:
            agent.history['arc_size'].append(arc.get('size', 0))
            agent.history['arc_hit_rate'].append(arc.get('hit_rate', 0))

def serialize_history(h: dict):
    '''Convert history deques to plain lists for JSON serialization.

    :param h: History dict containing deques
    '''
    return {
        'timestamps'   : list(h['timestamps']),
        'io'           : {dn: {k: list(v) for k, v in s.items()} for dn, s in h['io'].items()},
        'arc_size'     : list(h['arc_size']),
        'arc_hit_rate' : list(h['arc_hit_rate']),
        'temps'        : {dn: list(v) for dn, v in h['temps'].items()},
    }

# ── Server List ──────────────────────────────────────────────────────────────
def get_server_list():
    '''Build a summary list of all known agents for the fleet overview.'''
    out = []
    for hn, a in agents.items():
        disks_msg = a.current.get('disks', {})
        disks = disks_msg.get('disks', []) if isinstance(disks_msg, dict) else []
        pools_msg = a.current.get('pools', {})
        pools = pools_msg.get('pools', []) if isinstance(pools_msg, dict) else []
        sys_msg = a.current.get('system', {})
        sys_info = sys_msg.get('info', {}) if isinstance(sys_msg, dict) else {}
        # Uptime comes from the agent's 'system' message, not the dashboard host's /proc/uptime
        out.append({
            'hostname'       : hn,
            'online'         : a.online,
            'last_seen'      : a.last_seen,
            'disk_count'     : len(disks),
            'pool_count'     : len(pools),
            'alert_count'    : len(a.alerts_active),
            'total_raw'      : sum(d.get('size', 0) for d in disks),
            'total_usable'   : sum(p.get('size', 0) for p in pools),
            'total_used'     : sum(p.get('allocated', 0) for p in pools),
            'cpu_model'      : sys_info.get('cpu_model', ''),
            'uptime_seconds' : sys_info.get('uptime_seconds', 0),
        })
    return out

# ── WebSocket: Agents ────────────────────────────────────────────────────────
async def agent_ws_handler(request: aiohttp.web.Request):
    '''Handle WebSocket connections from monitoring agents.

    :param request: Incoming aiohttp request
    '''
    ws = aiohttp.web.WebSocketResponse(heartbeat=30, max_msg_size=4 * 1024 * 1024)
    await ws.prepare(request)
    hostname = None
    try:
        async for msg in ws:
            if msg.type == aiohttp.WSMsgType.TEXT:
                try:
                    data = json.loads(msg.data)
                except json.JSONDecodeError:
                    continue
                # Ignore valid JSON that is not a typed message object
                if not isinstance(data, dict) or not isinstance(data.get('type'), str):
                    continue
                if data['type'] == 'hello':
                    hostname = data.get('hostname', 'unknown')
                    if hostname in agents:
                        agents[hostname].ws = ws
                        agents[hostname].online = True
                        agents[hostname].last_seen = time.time()
                        agents[hostname].capabilities = data.get('capabilities', {})
                    else:
                        agents[hostname] = AgentState(hostname, ws)
                        agents[hostname].capabilities = data.get('capabilities', {})
                    logging.info('Agent connected: %s', hostname)
                    await broadcast_server_list()
                    continue
                if data['type'] == 'smarttest_result' and hostname:
                    await forward_to_browsers(hostname, data)
                    continue
                if hostname and hostname in agents:
                    a = agents[hostname]
                    a.last_seen = time.time()
                    a.current[data['type']] = data
                    update_history(a, data)
                    if data['type'] in ('disks', 'pools'):
                        await check_alerts(a)
                    await forward_to_browsers(hostname, data)
            elif msg.type in (aiohttp.WSMsgType.ERROR, aiohttp.WSMsgType.CLOSE):
                break
    finally:
        # Only mark the agent offline if this is still its active connection;
        # a stale socket closing after a reconnect must not clobber the new one
        if hostname and hostname in agents and agents[hostname].ws is ws:
            agents[hostname].online = False
            agents[hostname].ws = None
            logging.info('Agent disconnected: %s', hostname)
            await broadcast_server_list()
    return ws

# ── WebSocket: Browsers ─────────────────────────────────────────────────────
async def browser_ws_handler(request: aiohttp.web.Request):
    '''Handle WebSocket connections from browser clients.

    :param request: Incoming aiohttp request
    '''
    ws = aiohttp.web.WebSocketResponse(heartbeat=30)
    await ws.prepare(request)
    browser_subs[ws] = None
    try:
        await ws.send_json({'type': 'servers', 'servers': get_server_list()})
        await ws.send_json({'type': 'settings', 'settings': settings})
        async for msg in ws:
            if msg.type == aiohttp.WSMsgType.TEXT:
                try:
                    data = json.loads(msg.data)
                except json.JSONDecodeError:
                    continue
                # Ignore valid JSON that is not a message object
                if not isinstance(data, dict):
                    continue
                cmd = data.get('type')
                if cmd == 'subscribe':
                    hn = data.get('hostname')
                    browser_subs[ws] = hn
                    if hn and hn in agents:
                        a = agents[hn]
                        await ws.send_json({
                            'type'    : 'full_state',
                            'hostname': hn,
                            'online'  : a.online,
                            'current' : a.current,
                            'history' : serialize_history(a.history),
                            'alerts'  : {'active': a.alerts_active, 'log': list(a.alert_log)},
                        })
                    else:
                        await ws.send_json({
                            'type': 'full_state', 'hostname': hn or '',
                            'online': False, 'current': {}, 'history': {},
                            'alerts': {'active': [], 'log': []},
                        })
                elif cmd == 'smarttest':
                    hn = data.get('hostname')
                    if hn and hn in agents and agents[hn].online and agents[hn].ws:
                        try:
                            await agents[hn].ws.send_str(json.dumps({
                                'type': 'smarttest',
                                'device': data.get('device', ''),
                                'test_type': data.get('test_type', 'short'),
                            }))
                        except Exception:
                            pass
                elif cmd == 'save_settings':
                    new = data.get('settings', {})
                    for key in DEFAULT_SETTINGS:
                        if key in new:
                            expected = type(DEFAULT_SETTINGS[key])
                            value = new[key]
                            try:
                                if expected is bool and isinstance(value, str):
                                    # bool('false') is True, so parse string toggles explicitly
                                    settings[key] = value.strip().lower() in ('1', 'true', 'yes', 'on')
                                else:
                                    settings[key] = expected(value)
                            except (ValueError, TypeError):
                                pass
                    await asyncio.to_thread(save_settings)
                    for bws in list(browser_subs.keys()):
                        try:
                            await bws.send_json({'type': 'settings', 'settings': settings})
                        except Exception:
                            pass
                elif cmd == 'test_notification':
                    ok = await send_gotify('ZPulse Test', 'Test notification from ZPulse.', 5)
                    await ws.send_json({'type': 'test_notification_result', 'success': ok})
            elif msg.type in (aiohttp.WSMsgType.ERROR, aiohttp.WSMsgType.CLOSE):
                break
    finally:
        browser_subs.pop(ws, None)
    return ws

# ── Broadcast / Forward ─────────────────────────────────────────────────────
async def forward_to_browsers(hostname: str, data: dict):
    '''Forward an agent message to all browsers subscribed to that agent.

    :param hostname: Agent hostname the data came from
    :param data: Message dict to forward
    '''
    data_out = dict(data)
    data_out['hostname'] = hostname
    if data['type'] in ('disks', 'pools'):
        a = agents.get(hostname)
        if a:
            alert_msg = json.dumps({
                'type': 'alerts', 'hostname': hostname,
                'active': a.alerts_active, 'log': list(a.alert_log),
            })
            for bws, sub_hn in list(browser_subs.items()):
                if sub_hn == hostname:
                    try:
                        await bws.send_str(alert_msg)
                    except Exception:
                        pass
    msg = json.dumps(data_out)
    for bws, sub_hn in list(browser_subs.items()):
        if sub_hn == hostname:
            try:
                await bws.send_str(msg)
            except Exception:
                pass

async def broadcast_server_list():
    '''Send the current server list to all connected browsers.'''
    servers = get_server_list()
    msg = json.dumps({'type': 'servers', 'servers': servers})
    for bws in list(browser_subs.keys()):
        try:
            await bws.send_str(msg)
        except Exception:
            pass


async def periodic_broadcast(_app=None):
    '''Refresh the server list for all browsers every 30 seconds.'''
    while True:
        await asyncio.sleep(30)
        await broadcast_server_list()

# ── HTTP Routes ──────────────────────────────────────────────────────────────
async def index_handler(request: aiohttp.web.Request):
    '''Serve the main dashboard HTML page.

    :param request: Incoming aiohttp request
    '''
    return aiohttp.web.FileResponse(BASE_DIR / 'templates' / 'index.html')

# ── Lifecycle ────────────────────────────────────────────────────────────────
async def on_startup(app: aiohttp.web.Application):
    '''Start background tasks when the server starts.

    :param app: aiohttp application instance
    '''
    app['periodic_task'] = asyncio.create_task(periodic_broadcast())


async def on_shutdown(app: aiohttp.web.Application):
    '''Cancel background tasks and close HTTP sessions on shutdown.

    :param app: aiohttp application instance
    '''
    global gotify_session
    app['periodic_task'].cancel()
    if gotify_session:
        await gotify_session.close()

if __name__ == '__main__':
    # Parse command line arguments
    parser = argparse.ArgumentParser(description='ZPulse Dashboard')
    parser.add_argument('--host', default='0.0.0.0', help='Listen address (default: 0.0.0.0)')
    parser.add_argument('--port', type=int, default=8888, help='Listen port (default: 8888)')
    parser.add_argument('-d', '--debug', action='store_true', help='Enable debug logging')
    args = parser.parse_args()

    # Setup logging
    if args.debug:
        apv.setup_logging(level='DEBUG', log_to_disk=True, max_log_size=5*1024*1024, max_backups=5, compress_backups=True, log_file_name='zpulse-dashboard', show_details=True)
        logging.debug('Debug logging enabled')
    else:
        apv.setup_logging(level='INFO')

    load_settings()
    logging.info('ZPulse Dashboard starting on http://%s:%d', args.host, args.port)

    app = aiohttp.web.Application()
    app.router.add_get('/', index_handler)
    app.router.add_get('/ws/agent', agent_ws_handler)
    app.router.add_get('/ws', browser_ws_handler)
    app.router.add_static('/static', str(BASE_DIR / 'static'))
    app.on_startup.append(on_startup)
    app.on_shutdown.append(on_shutdown)
    aiohttp.web.run_app(app, host=args.host, port=args.port, print=lambda s: logging.info(s))

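The alert cooldown in `_should_alert` above can be tried in isolation. The following is a minimal sketch of the same idea, not the dashboard's code: `CooldownGate`, its `clock` parameter, and the sample values are hypothetical names introduced here so the behaviour can be exercised without waiting on real time. It assumes one timestamp per `alert_type:target` key, as in the original.

```python
import time


class CooldownGate:
    '''Hypothetical helper: allow at most one alert per key per cooldown window.'''

    def __init__(self, cooldown: float, clock=time.time):
        self.cooldown = cooldown    # seconds between repeated alerts for one key
        self.clock = clock          # injectable so the gate can be driven by a fake clock
        self.last_fired = {}        # 'alert_type:target' -> time of the last alert

    def should_fire(self, alert_type: str, target: str) -> bool:
        key = f'{alert_type}:{target}'
        now = self.clock()
        last = self.last_fired.get(key)
        if last is not None and now - last < self.cooldown:
            return False            # still inside the cooldown window, stay quiet
        self.last_fired[key] = now  # record the time only when the alert fires
        return True


# Drive the gate with a fake clock (a one-element list the lambda reads from)
fake_now = [1000.0]
gate = CooldownGate(3600, clock=lambda: fake_now[0])

first = gate.should_fire('disk_temp_crit', 'sda')   # first alert for this key fires
repeat = gate.should_fire('disk_temp_crit', 'sda')  # same key, inside the window
other = gate.should_fire('disk_temp_warn', 'sda')   # different key has its own window
fake_now[0] += 3600
later = gate.should_fire('disk_temp_crit', 'sda')   # window has elapsed
```

Because critical and warning alerts use different keys, a disk that moves from warning to critical notifies immediately rather than waiting out the warning cooldown.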

@@ -0,0 +1,5 @@
# ZPulse - Developed by acidvegas in Python (https://github.com/acidvegas/rackwatch)
# zpulse/dashboard/requirements.txt
aiohttp
apv

61
dashboard/setup.sh Executable file

@@ -0,0 +1,61 @@
#!/bin/sh
# ZPulse - Developed by acidvegas in Python (https://github.com/acidvegas/rackwatch)
# zpulse/dashboard/setup.sh
# Set trace, verbose, and exit on error
set -xev
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
INSTALL_DIR="/opt/zpulse-dashboard"
SERVICE_NAME="zpulse-dashboard"
# Check if running as root
[ "$(id -u)" -ne 0 ] && { echo "Run as root: sudo $0"; exit 1; }
# Install system packages
apt-get update -qq && apt-get install -y python3-pip python3-venv curl
# Copy dashboard files to install directory
mkdir -p "$INSTALL_DIR/templates" "$INSTALL_DIR/static"
cp "$SCRIPT_DIR/dashboard.py" "$INSTALL_DIR/"
cp "$SCRIPT_DIR/requirements.txt" "$INSTALL_DIR/"
cp "$SCRIPT_DIR/templates/index.html" "$INSTALL_DIR/templates/"
# Fetch Chart.js
if [ ! -f "$INSTALL_DIR/static/chart.min.js" ]; then
    # -f makes curl fail on HTTP errors instead of saving an error page as chart.min.js
    curl -fsSL "https://cdn.jsdelivr.net/npm/chart.js@4/dist/chart.umd.min.js" -o "$INSTALL_DIR/static/chart.min.js"
fi
# Create a Python virtual environment & install dependencies
python3 -m venv "$INSTALL_DIR/venv"
"$INSTALL_DIR/venv/bin/pip" install --quiet -r "$INSTALL_DIR/requirements.txt"
# Install the systemd service
cat > /etc/systemd/system/${SERVICE_NAME}.service <<EOF
[Unit]
Description=ZPulse Dashboard
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
WorkingDirectory=$INSTALL_DIR
ExecStart=$INSTALL_DIR/venv/bin/python $INSTALL_DIR/dashboard.py
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
EOF
# Reload the systemd daemon & enable & start the service
systemctl daemon-reload && systemctl enable ${SERVICE_NAME} && systemctl start ${SERVICE_NAME}
echo "ZPulse Dashboard installed to $INSTALL_DIR and running!"
echo " Open: http://$(hostname -I | awk '{print $1}'):8888"
echo " Status: systemctl status ${SERVICE_NAME}"
echo " Logs: journalctl -u ${SERVICE_NAME} -f"
echo " Stop: systemctl stop ${SERVICE_NAME}"
echo " Restart: systemctl restart ${SERVICE_NAME}"

14
dashboard/static/chart.min.js vendored Normal file

File diff suppressed because one or more lines are too long


@@ -0,0 +1,614 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>ZPulse</title>
<link rel="icon" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 32 32'><rect width='32' height='32' rx='6' fill='%233b82f6'/><text x='16' y='23' text-anchor='middle' fill='white' font-size='20' font-weight='bold' font-family='sans-serif'>Z</text></svg>">
<style>
:root{--bg:#06090f;--surface:#0d1420;--surface2:#131c2e;--surface3:#192436;--border:#1e2d44;--text:#d4dae6;--text2:#7a879e;--accent:#3b82f6;--accent2:#2563eb;--green:#22c55e;--yellow:#eab308;--red:#ef4444;--orange:#f97316;--cyan:#06b6d4;--purple:#a855f7;--radius:8px;--font:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;--mono:'SF Mono','Cascadia Code','Fira Code',monospace}
*,*::before,*::after{box-sizing:border-box;margin:0;padding:0}
html{scroll-behavior:smooth;scrollbar-color:var(--surface3) var(--bg)}
body{font-family:var(--font);background:var(--bg);color:var(--text);min-height:100vh;line-height:1.5}
a{color:var(--accent);text-decoration:none}
nav{position:sticky;top:0;z-index:100;background:rgba(13,20,32,.85);border-bottom:1px solid var(--border);padding:0 1.5rem;display:flex;align-items:center;height:46px;gap:1rem;backdrop-filter:blur(16px)}
nav .logo{font-weight:700;font-size:.95rem;color:var(--accent);white-space:nowrap}
nav .conn{width:7px;height:7px;border-radius:50%;background:var(--red);flex-shrink:0;transition:background .3s}
nav .nav-links{display:flex;gap:.1rem;margin-left:auto}
nav .nav-links a{padding:.3rem .55rem;border-radius:6px;font-size:.75rem;color:var(--text2);transition:all .15s}
nav .nav-links a:hover{color:var(--text);background:var(--surface2)}
.btn{padding:.35rem .8rem;border-radius:6px;font-size:.75rem;background:var(--accent);color:#fff;border:none;cursor:pointer;font-family:var(--font);transition:background .15s}
.btn:hover{background:var(--accent2)}
.btn-ghost{background:transparent;border:1px solid var(--border);color:var(--text2)}
.btn-ghost:hover{border-color:var(--accent);color:var(--accent);background:transparent}
.btn-sm{padding:.22rem .55rem;font-size:.7rem}
.server-select{background:var(--surface2);border:1px solid var(--border);border-radius:6px;color:var(--text);font-family:var(--mono);font-size:.75rem;padding:.25rem .5rem;outline:none;cursor:pointer;max-width:160px}
.server-select:focus{border-color:var(--accent)}
main{max-width:1440px;margin:0 auto;padding:1rem 1.5rem}
section{margin-bottom:1.5rem}
.section-title{font-size:.9rem;font-weight:600;margin-bottom:.65rem;padding-bottom:.35rem;border-bottom:1px solid var(--border);display:flex;align-items:center;gap:.5rem}
.section-title .badge{font-size:.62rem;padding:.1rem .4rem;border-radius:99px;font-weight:500}
.section-sub{font-size:.72rem;color:var(--text2);margin-top:-.35rem;margin-bottom:.65rem}
.grid{display:grid;gap:.65rem}
.g4{grid-template-columns:repeat(4,1fr)}.g3{grid-template-columns:repeat(3,1fr)}.g2{grid-template-columns:repeat(2,1fr)}
.card{background:var(--surface);border:1px solid var(--border);border-radius:var(--radius);padding:.9rem 1rem}
.card-label{font-size:.65rem;color:var(--text2);text-transform:uppercase;letter-spacing:.06em;margin-bottom:.15rem}
.card-value{font-size:1.4rem;font-weight:700;font-family:var(--mono);letter-spacing:-.03em}
.card-sub{font-size:.7rem;color:var(--text2);margin-top:.1rem}
.stat-bar{height:4px;border-radius:2px;background:var(--surface3);margin-top:.35rem;overflow:hidden}
.stat-bar-fill{height:100%;border-radius:2px;transition:width .5s}
.fleet-card{cursor:pointer;transition:border-color .2s,transform .15s}
.fleet-card:hover{border-color:var(--accent);transform:translateY(-2px)}
.fleet-card.offline{opacity:.6}
.pool-header{display:flex;justify-content:space-between;align-items:center;margin-bottom:.6rem}
.pool-name{font-size:.95rem;font-weight:600;font-family:var(--mono)}
.hb{font-size:.65rem;padding:.12rem .45rem;border-radius:99px;font-weight:600;text-transform:uppercase;letter-spacing:.04em}
.h-on{background:rgba(34,197,94,.12);color:var(--green)}.h-deg{background:rgba(234,179,8,.12);color:var(--yellow)}.h-flt{background:rgba(239,68,68,.12);color:var(--red)}.h-unk{background:rgba(122,135,158,.12);color:var(--text2)}
.pool-stats{display:grid;grid-template-columns:repeat(auto-fit,minmax(85px,1fr));gap:.4rem;margin-bottom:.6rem}
.ps-l{font-size:.6rem;color:var(--text2);text-transform:uppercase;letter-spacing:.04em}.ps-v{font-size:.85rem;font-weight:600;font-family:var(--mono)}
.vdev-tree{font-family:var(--mono);font-size:.72rem;margin-top:.4rem}
.vr{display:grid;grid-template-columns:1fr 68px 48px 48px 48px;padding:.2rem .35rem;border-radius:3px;align-items:center}
.vr:nth-child(even){background:var(--surface2)}.vr.vh{color:var(--text2);font-weight:600;font-size:.62rem;text-transform:uppercase;letter-spacing:.04em}
.vr .zero{color:var(--surface3)}.vr .err{color:var(--red);font-weight:600}
.chart-container{position:relative;height:195px;width:100%}
.chart-unavail{display:flex;align-items:center;justify-content:center;height:195px;color:var(--text2);font-size:.8rem}
.disk-header{display:flex;justify-content:space-between;align-items:flex-start;margin-bottom:.5rem}
.disk-name{font-size:.95rem;font-weight:600;font-family:var(--mono)}
.dtb{font-size:.62rem;padding:.1rem .4rem;border-radius:99px;font-weight:600;text-transform:uppercase;letter-spacing:.04em;background:rgba(59,130,246,.12);color:var(--accent)}
.dtb-sas{background:rgba(168,85,247,.12);color:var(--purple)}.dtb-nv{background:rgba(6,182,212,.12);color:var(--cyan)}
.disk-stats{display:grid;grid-template-columns:repeat(4,1fr);gap:.35rem;margin-bottom:.5rem}
.ds{padding:.35rem .45rem;background:var(--surface2);border-radius:6px}
.ds-l{font-size:.58rem;color:var(--text2);text-transform:uppercase}.ds-v{font-size:.8rem;font-weight:600;font-family:var(--mono)}
.score-badge{font-size:.65rem;padding:.12rem .4rem;border-radius:99px;font-weight:700;font-family:var(--mono)}
.tbl{width:100%;font-size:.75rem;border-collapse:collapse;table-layout:fixed}
.tbl th{text-align:left;padding:.35rem .5rem;color:var(--text2);border-bottom:1px solid var(--border);font-weight:600;font-size:.65rem;text-transform:uppercase;letter-spacing:.04em;white-space:nowrap;overflow:hidden}
.tbl td{padding:.3rem .5rem;border-bottom:1px solid var(--surface3);font-family:var(--mono);font-size:.72rem;white-space:nowrap;overflow:hidden;text-overflow:ellipsis}
.tbl tr:hover td{background:var(--surface2)}.tbl th.r,.tbl td.r{text-align:right}
.tbl th.sort{cursor:pointer;user-select:none}.tbl th.sort:hover{color:var(--text)}
.io-t th:nth-child(1){width:8%}.io-t th:nth-child(2),.io-t th:nth-child(3){width:12%}.io-t th:nth-child(4),.io-t th:nth-child(5){width:10%}.io-t th:nth-child(6),.io-t th:nth-child(7){width:9%}.io-t th:nth-child(8){width:8%}.io-t th:nth-child(9){width:12%}
.alert-list{display:flex;flex-direction:column;gap:.35rem}
.alert-item{display:flex;align-items:center;gap:.5rem;padding:.4rem .65rem;border-radius:6px;font-size:.75rem;border-left:3px solid}
.alert-warning{background:rgba(234,179,8,.06);border-color:var(--yellow)}.alert-critical{background:rgba(239,68,68,.06);border-color:var(--red)}.alert-info{background:rgba(59,130,246,.06);border-color:var(--accent)}
.alert-time{font-size:.65rem;color:var(--text2);font-family:var(--mono);white-space:nowrap}
.modal-overlay{display:none;position:fixed;inset:0;z-index:200;background:rgba(0,0,0,.6);backdrop-filter:blur(4px);align-items:center;justify-content:center}
.modal-overlay.open{display:flex}
.modal{background:var(--surface);border:1px solid var(--border);border-radius:12px;padding:1.5rem;width:90%;max-width:520px;max-height:85vh;overflow-y:auto}
.modal h2{font-size:1rem;margin-bottom:1rem}
.modal label{display:block;font-size:.72rem;color:var(--text2);margin-bottom:.2rem;margin-top:.75rem}.modal label:first-of-type{margin-top:0}
.modal input{width:100%;padding:.45rem .65rem;border-radius:6px;background:var(--surface2);border:1px solid var(--border);color:var(--text);font-family:var(--mono);font-size:.8rem;outline:none}.modal input:focus{border-color:var(--accent)}
.modal .toggle-row{display:flex;align-items:center;justify-content:space-between;margin-top:.75rem}.modal .toggle-row label{margin:0}
.toggle{position:relative;width:36px;height:19px;cursor:pointer}.toggle input{opacity:0;width:0;height:0}
.toggle .slider{position:absolute;inset:0;border-radius:10px;background:var(--surface3);transition:background .2s}
.toggle .slider::before{content:'';position:absolute;width:13px;height:13px;border-radius:50%;left:3px;top:3px;background:var(--text2);transition:all .2s}
.toggle input:checked+.slider{background:var(--accent)}.toggle input:checked+.slider::before{transform:translateX(17px);background:#fff}
.modal-actions{display:flex;gap:.5rem;margin-top:1rem}.modal-actions .btn{flex:1;text-align:center}
.smart-modal{max-width:820px}.smart-modal h2{display:flex;align-items:center;gap:.5rem;flex-wrap:wrap}
.si-grid{display:grid;grid-template-columns:repeat(auto-fit,minmax(125px,1fr));gap:.35rem;margin-bottom:.85rem}
.si-item{padding:.35rem .5rem;background:var(--surface2);border-radius:6px}
.si-l{font-size:.58rem;color:var(--text2);text-transform:uppercase;letter-spacing:.04em}.si-v{font-size:.8rem;font-weight:600;font-family:var(--mono)}
.st{width:100%;font-size:.7rem;font-family:var(--mono);border-collapse:collapse}
.st th{text-align:left;padding:.3rem .4rem;color:var(--text2);border-bottom:1px solid var(--border);font-weight:600;font-size:.62rem;text-transform:uppercase;letter-spacing:.04em;position:sticky;top:0;background:var(--surface)}
.st td{padding:.25rem .4rem;border-bottom:1px solid var(--surface3)}.st tr:hover td{background:var(--surface2)}
.st .attr-warn{color:var(--yellow)}.st .attr-crit{color:var(--red);font-weight:600}.st .attr-note{color:var(--text2);font-style:italic;font-size:.62rem}
.empty{text-align:center;padding:1.25rem;color:var(--text2);font-size:.82rem}
.tc{color:var(--green)}.tw{color:var(--yellow)}.th{color:var(--red)}
@media(max-width:1024px){.g4{grid-template-columns:repeat(2,1fr)}.g2{grid-template-columns:1fr}.pool-stats{grid-template-columns:repeat(3,1fr)}.disk-stats{grid-template-columns:repeat(2,1fr)}}
@media(max-width:640px){nav .nav-links{display:none}main{padding:.75rem}.g4,.g3{grid-template-columns:1fr 1fr}.smart-modal{max-width:98%;padding:1rem}}
</style>
</head>
<body>
<nav>
<div class="logo">ZPulse</div>
<select class="server-select" id="server-select" onchange="switchServer(this.value)">
<option value="">Fleet Overview</option>
</select>
<div class="conn" id="conn-dot" title="Disconnected"></div>
<div class="nav-links" id="detail-nav" style="display:none">
<a href="#overview">Overview</a><a href="#pools">Pools</a><a href="#io">I/O</a><a href="#arc">ARC</a><a href="#temps">Temp</a><a href="#disks-section">Disks</a><a href="#datasets">Datasets</a><a href="#alerts-section">Alerts</a>
</div>
<button class="btn btn-ghost btn-sm" onclick="openSettings()">Settings</button>
</nav>
<main>
<!-- Fleet Overview -->
<section id="fleet">
<div class="section-title">Server Fleet <span class="badge h-on" id="fleet-count">0</span></div>
<div class="grid g3" id="fleet-cards"><div class="empty">Waiting for agents to connect...</div></div>
</section>
<!-- Per-server detail (hidden until a server is selected) -->
<div id="server-detail" style="display:none">
<section id="overview">
<div class="grid g4" id="overview-cards"><div class="card"><div class="card-label">Loading...</div></div></div>
<div class="grid g4" id="system-cards" style="margin-top:.65rem"></div>
</section>
<section id="pools"><div class="section-title">ZFS Pools <span class="badge h-on" id="pool-count-badge"></span></div><div id="pool-cards"></div></section>
<section id="io">
<div class="section-title">I/O Performance</div>
<div class="grid g4" id="io-summary-cards"></div>
<div class="grid g2" style="margin-top:.65rem">
<div class="card"><div class="card-label">Throughput</div><div class="chart-container"><canvas id="chart-throughput"></canvas></div></div>
<div class="card"><div class="card-label">IOPS</div><div class="chart-container"><canvas id="chart-iops"></canvas></div></div>
</div>
<div class="card" style="margin-top:.65rem;overflow-x:auto">
<table class="tbl io-t"><thead><tr><th>Disk</th><th class="r">Read</th><th class="r">Write</th><th class="r">R IOPS</th><th class="r">W IOPS</th><th class="r">R Lat</th><th class="r">W Lat</th><th class="r">Busy</th><th>Pool</th></tr></thead><tbody id="io-disk-tbody"></tbody></table>
</div>
</section>
<section id="arc">
<div class="section-title">ARC Cache</div>
<div class="section-sub">Adaptive Replacement Cache — ZFS uses free RAM as a read cache. Hit rate shown is instantaneous (per sample interval).</div>
<div class="grid g4" id="arc-cards"></div>
<div class="grid g2" style="margin-top:.65rem">
<div class="card"><div class="card-label">Hit Rate</div><div class="chart-container"><canvas id="chart-arc-hitrate"></canvas></div></div>
<div class="card"><div class="card-label">ARC Size</div><div class="chart-container"><canvas id="chart-arc-size"></canvas></div></div>
</div>
</section>
<section id="temps">
<div class="section-title">Disk Temperatures</div>
<div class="card"><div class="chart-container" style="height:200px"><canvas id="chart-temps"></canvas></div></div>
</section>
<section id="disks-section"><div class="section-title">Physical Disks <span class="badge h-on" id="disk-count-badge"></span></div><div class="grid g2" id="disk-cards"></div></section>
<section id="datasets">
<div class="section-title">ZFS Datasets</div>
<div class="card" style="overflow-x:auto"><table class="tbl" id="ds-table"><thead><tr><th class="sort" data-sort="name">Name</th><th class="sort r" data-sort="used">Used</th><th class="sort r" data-sort="available">Avail</th><th class="sort r" data-sort="referenced">Refer</th><th class="sort" data-sort="compression">Comp</th><th class="sort r" data-sort="compressratio">Ratio</th><th class="sort" data-sort="mountpoint">Mount</th></tr></thead><tbody id="ds-tbody"></tbody></table></div>
</section>
<section id="snapshots">
<div class="section-title">Snapshots <span class="badge h-on" id="snap-count-badge">0</span></div>
<div class="card" style="overflow-x:auto"><table class="tbl"><thead><tr><th>Name</th><th class="r">Used</th><th class="r">Referenced</th><th>Created</th></tr></thead><tbody id="snap-tbody"></tbody></table></div>
</section>
<section id="alerts-section">
<div class="section-title">Alerts</div>
<div id="alert-active" class="alert-list" style="margin-bottom:.65rem"></div>
<div class="card"><div class="card-label">Log</div><div id="alert-log" class="alert-list" style="margin-top:.4rem"></div></div>
</section>
</div><!-- /server-detail -->
</main>
<div class="modal-overlay" id="settings-modal" onclick="if(event.target===this)closeSettings()">
<div class="modal"><h2>Settings</h2>
  <label for="set-gotify-url">Gotify Server URL</label><input type="url" id="set-gotify-url" placeholder="https://gotify.example.com">
  <label for="set-gotify-token">Gotify App Token</label><input type="text" id="set-gotify-token" placeholder="AbCdEf123456">
  <label for="set-temp-warn">Temp Warning (°C)</label><input type="number" id="set-temp-warn" min="20" max="80">
  <label for="set-temp-crit">Temp Critical (°C)</label><input type="number" id="set-temp-crit" min="30" max="90">
  <label for="set-space-warn">Space Warning (%)</label><input type="number" id="set-space-warn" min="50" max="99">
  <label for="set-space-crit">Space Critical (%)</label><input type="number" id="set-space-crit" min="60" max="99">
  <label for="set-cooldown">Alert Cooldown (s)</label><input type="number" id="set-cooldown" min="60" max="86400">
<div class="toggle-row"><label>SMART Alerts</label><label class="toggle"><input type="checkbox" id="set-smart-alerts"><span class="slider"></span></label></div>
<div class="toggle-row"><label>Pool Alerts</label><label class="toggle"><input type="checkbox" id="set-pool-alerts"><span class="slider"></span></label></div>
<div class="modal-actions"><button class="btn btn-ghost" onclick="closeSettings()">Cancel</button><button class="btn btn-ghost" onclick="testNotification()">Test</button><button class="btn" onclick="saveSettings()">Save</button></div>
</div></div>
<div class="modal-overlay" id="smart-modal" onclick="if(event.target===this)closeSmartModal()">
<div class="modal smart-modal"><div id="smart-modal-content"></div><div class="modal-actions" style="margin-top:.65rem"><button class="btn btn-ghost" onclick="closeSmartModal()" style="flex:0 0 auto">Close</button></div></div>
</div>
<script src="/static/chart.min.js"></script>
<script>
const B=n=>{if(n==null||isNaN(n))return'—';const u=['B','KiB','MiB','GiB','TiB','PiB'];let i=0,v=Math.abs(n);while(v>=1024&&i<u.length-1){v/=1024;i++}return v.toFixed(i>0?2:0)+' '+u[i]};
const Bps=n=>{if(n==null||isNaN(n))return'—';const u=['B/s','KiB/s','MiB/s','GiB/s'];let i=0,v=Math.abs(n);while(v>=1024&&i<u.length-1){v/=1024;i++}return v.toFixed(2)+' '+u[i]};
const pct=v=>v!=null?v.toFixed(1)+'%':'—';
const hc=h=>{if(!h)return'h-unk';const s=String(h).toUpperCase();return s==='ONLINE'?'h-on':s==='DEGRADED'?'h-deg':s==='FAULTED'?'h-flt':'h-unk'};
const tc=t=>t==null?'':t<40?'tc':t<50?'tw':'th';
const fmtH=h=>{if(h==null)return'—';const y=Math.floor(h/8766),d=Math.floor((h%8766)/24);let s=h.toLocaleString()+'h';if(y>0)s+=` (${y}y ${d}d)`;else if(d>0)s+=` (${d}d)`;return s};
const fmtLat=ms=>ms>0?ms.toFixed(1)+'ms':'—';
const latCol=ms=>ms>50?'var(--red)':ms>20?'var(--yellow)':ms>0?'var(--text)':'var(--text2)';
const busyCol=p=>p>80?'var(--red)':p>40?'var(--yellow)':p>5?'var(--text)':'var(--text2)';
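/* A missing health score arrives here as '-'; show it in neutral grey instead of letting it fall through to red. */
const scoreCol=s=>typeof s!=='number'?'var(--text2)':s>=80?'var(--green)':s>=50?'var(--yellow)':'var(--red)';
const scoreBg=s=>typeof s!=='number'?'rgba(122,135,158,.12)':s>=80?'rgba(34,197,94,.12)':s>=50?'rgba(234,179,8,.12)':'rgba(239,68,68,.12)';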
const relTime=ts=>{const s=Math.floor(Date.now()/1000-ts);if(s<60)return'just now';if(s<3600)return Math.floor(s/60)+'m ago';if(s<86400)return Math.floor(s/3600)+'h ago';return Math.floor(s/86400)+'d ago'};
const fmtUptime=sec=>{if(!sec)return'';const d=Math.floor(sec/86400),h=Math.floor((sec%86400)/3600);return d>0?d+'d '+h+'h':h+'h '+Math.floor((sec%3600)/60)+'m'};
const SEAGATE=new Set([1,7,195]);
const DISK_COLORS=['#3b82f6','#f97316','#22c55e','#a855f7','#06b6d4','#ef4444','#eab308','#ec4899','#14b8a6','#f43f5e','#8b5cf6','#84cc16'];
let ws=null, selectedServer=null, servers=[], settingsData={};
const state={disks:[],pools:[],datasets:[],snapshots:[],ioRates:{},poolMap:{},arc:null,alertsActive:[],alertLog:[]};
const charts={};
let chartsReady=false;
const POLL_IO=3000,CHART_WINDOW=3*60*1000;
/* ── Charts ──────────────────────────────────────────────────────────── */
const CHART_OPTS=()=>({responsive:true,maintainAspectRatio:false,
animation:{duration:POLL_IO,easing:'linear'},animations:{y:{duration:0}},
interaction:{mode:'nearest',axis:'x',intersect:false},
plugins:{legend:{display:true,position:'top',labels:{color:'#7a879e',boxWidth:10,boxHeight:2,font:{size:10}}},tooltip:{backgroundColor:'#131c2e',borderColor:'#1e2d44',borderWidth:1,titleColor:'#d4dae6',bodyColor:'#d4dae6',bodyFont:{family:"'SF Mono',monospace",size:10}}},
scales:{x:{type:'linear',ticks:{callback:v=>new Date(v).toLocaleTimeString([],{hour:'2-digit',minute:'2-digit',second:'2-digit'}),maxTicksLimit:6,color:'#7a879e',font:{size:9}},grid:{color:'#1e2d4418'}},y:{ticks:{color:'#7a879e',font:{size:9},maxTicksLimit:6},grid:{color:'#1e2d4418'},beginAtZero:true}}});
function mkChart(id,l1,l2,c1,c2){const ctx=document.getElementById(id);if(!ctx)return null;return new Chart(ctx,{type:'line',data:{datasets:[{label:l1,data:[],borderColor:c1,backgroundColor:c1+'15',borderWidth:1.5,pointRadius:0,fill:true,tension:.35},{label:l2,data:[],borderColor:c2,backgroundColor:c2+'15',borderWidth:1.5,pointRadius:0,fill:true,tension:.35}]},options:CHART_OPTS()})}
function initCharts(){
if(typeof Chart==='undefined'){document.querySelectorAll('.chart-container').forEach(el=>{el.innerHTML='<div class="chart-unavail">Chart library unavailable</div>'});return}
try{
charts.tp=mkChart('chart-throughput','Read','Write','#3b82f6','#f97316');
charts.iops=mkChart('chart-iops','Read','Write','#06b6d4','#a855f7');
charts.arcH=mkChart('chart-arc-hitrate','Hits %','Misses %','#22c55e','#ef4444');
charts.arcS=mkChart('chart-arc-size','Size','Target','#3b82f6','#7a879e');
if(charts.tp){charts.tp.options.scales.y.ticks.callback=Bps;charts.tp.options.plugins.tooltip.callbacks={label:c=>c.dataset.label+': '+Bps(c.parsed.y)}}
if(charts.arcH){charts.arcH.options.scales.y.max=100;charts.arcH.options.scales.y.min=0}
if(charts.arcS){charts.arcS.options.scales.y.ticks.callback=B;charts.arcS.options.plugins.tooltip.callbacks={label:c=>c.dataset.label+': '+B(c.parsed.y)}}
chartsReady=true;
}catch(e){console.error('Chart init:',e)}
}
function destroyCharts(){for(const k of Object.keys(charts)){if(charts[k]){charts[k].destroy();delete charts[k]}}chartsReady=false}
function initTempChart(names){
if(typeof Chart==='undefined'||charts.temps)return;
const ctx=document.getElementById('chart-temps');if(!ctx)return;
const opts=CHART_OPTS();opts.scales.y.ticks.callback=v=>v+'°C';opts.plugins.tooltip.callbacks={label:c=>c.dataset.label+': '+c.parsed.y+'°C'};
charts.temps=new Chart(ctx,{type:'line',data:{datasets:names.map((n,i)=>({label:n,data:[],borderColor:DISK_COLORS[i%DISK_COLORS.length],borderWidth:1.5,pointRadius:0,tension:.35,fill:false}))},options:opts});
}
function pushCh(ch,ts,vals){if(!ch)return;vals.forEach((v,i)=>ch.data.datasets[i].data.push({x:ts,y:v}));ch.options.scales.x.min=ts-CHART_WINDOW;ch.options.scales.x.max=ts;const c=ts-CHART_WINDOW*2;ch.data.datasets.forEach(ds=>{while(ds.data.length>0&&ds.data[0].x<c)ds.data.shift()});ch.update()}
function bulkCh(ch,ts,vs){if(!ch||!ts.length)return;ch.data.datasets.forEach((ds,i)=>{ds.data=ts.map((t,j)=>({x:t*1000,y:vs[i][j]||0}))});const l=ts[ts.length-1]*1000;ch.options.scales.x.min=l-CHART_WINDOW;ch.options.scales.x.max=l;ch.update('none')}
function loadHist(h){
if(!chartsReady||!h||!h.timestamps||!h.timestamps.length)return;
const ts=h.timestamps,len=ts.length;
if(charts.tp){let r=new Array(len).fill(0),w=new Array(len).fill(0);for(const d of Object.values(h.io||{})){if(d.read_bps)d.read_bps.forEach((v,i)=>r[i]+=v||0);if(d.write_bps)d.write_bps.forEach((v,i)=>w[i]+=v||0)}bulkCh(charts.tp,ts,[r,w])}
if(charts.iops){let r=new Array(len).fill(0),w=new Array(len).fill(0);for(const d of Object.values(h.io||{})){if(d.read_iops)d.read_iops.forEach((v,i)=>r[i]+=v||0);if(d.write_iops)d.write_iops.forEach((v,i)=>w[i]+=v||0)}bulkCh(charts.iops,ts,[r,w])}
if(charts.arcH&&h.arc_hit_rate&&h.arc_hit_rate.length)bulkCh(charts.arcH,ts,[h.arc_hit_rate,h.arc_hit_rate.map(v=>v!=null?+(100-v).toFixed(2):0)]);
if(charts.arcS&&h.arc_size&&h.arc_size.length)bulkCh(charts.arcS,ts,[h.arc_size,h.arc_size]);
if(h.temps&&Object.keys(h.temps).length){
const dn=Object.keys(h.temps).sort();if(!charts.temps)initTempChart(dn);
    /* Keep at most `len` samples per disk so a series longer than the timestamp list cannot index before ts[0]. */
    if(charts.temps){charts.temps.data.datasets.forEach(ds=>{const v=(h.temps[ds.label]||[]).slice(-len);const off=len-v.length;ds.data=v.map((val,j)=>({x:ts[off+j]*1000,y:val}))});if(ts.length){const l=ts[ts.length-1]*1000;charts.temps.options.scales.x.min=l-CHART_WINDOW;charts.temps.options.scales.x.max=l}charts.temps.update('none')}
}
}
function appendIO(rates,ts){if(!chartsReady||!rates)return;const t=ts*1000;let r=0,w=0,ri=0,wi=0;for(const d of Object.values(rates)){r+=d.read_bps||0;w+=d.write_bps||0;ri+=d.read_iops||0;wi+=d.write_iops||0}pushCh(charts.tp,t,[r,w]);pushCh(charts.iops,t,[ri,wi])}
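/* A null hit rate plots as 0/0, matching loadHist, instead of a bogus 100% miss point. */
function appendArc(arc,ts){if(!chartsReady||!arc)return;const t=ts*1000,hr=arc.hit_rate;pushCh(charts.arcH,t,[hr||0,hr!=null?+(100-hr).toFixed(2):0]);pushCh(charts.arcS,t,[arc.size,arc.target_size])}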
function appendTemps(temps,ts){
if(!temps||!Object.keys(temps).length)return;
if(!charts.temps)initTempChart(Object.keys(temps).sort());
if(!charts.temps)return;
const t=ts*1000;
charts.temps.data.datasets.forEach(ds=>{const v=temps[ds.label];if(v!=null)ds.data.push({x:t,y:v})});
charts.temps.options.scales.x.min=t-CHART_WINDOW;charts.temps.options.scales.x.max=t;
const c=t-CHART_WINDOW*2;charts.temps.data.datasets.forEach(ds=>{while(ds.data.length>0&&ds.data[0].x<c)ds.data.shift()});
charts.temps.update();
}
/* ── Fleet Rendering ─────────────────────────────────────────────────── */
function renderFleet(){
const el=document.getElementById('fleet-cards'),badge=document.getElementById('fleet-count');
if(!servers||!servers.length){el.innerHTML='<div class="empty">Waiting for agents to connect...</div>';badge.textContent='0';return}
badge.textContent=servers.length;
el.innerHTML=servers.map(s=>{
const on=s.online,up=s.total_usable>0?(s.total_used/s.total_usable*100):0;
const bc=up>90?'var(--red)':up>75?'var(--yellow)':'var(--accent)';
const upStr=fmtUptime(s.uptime_seconds||0);
const ac=s.alert_count||0;
return`<div class="card fleet-card${on?'':' offline'}" onclick="document.getElementById('server-select').value='${s.hostname}';switchServer('${s.hostname}')">
<div style="display:flex;justify-content:space-between;align-items:center;margin-bottom:.5rem">
<span style="font-weight:700;font-family:var(--mono);font-size:.95rem">${s.hostname}</span>
<span class="hb ${on?'h-on':'h-flt'}">${on?'ONLINE':'OFFLINE'}</span>
</div>
${s.total_usable>0?`<div style="font-size:.8rem;margin-bottom:.3rem">${B(s.total_used)} / ${B(s.total_usable)}</div><div class="stat-bar"><div class="stat-bar-fill" style="width:${up.toFixed(1)}%;background:${bc}"></div></div>`:'<div style="font-size:.8rem;color:var(--text2);margin-bottom:.3rem">No pool data</div>'}
<div style="font-size:.72rem;color:var(--text2);margin-top:.4rem">${s.disk_count} disk${s.disk_count!==1?'s':''} · ${s.pool_count} pool${s.pool_count!==1?'s':''} · <span style="color:${ac>0?'var(--red)':'var(--green)'}">${ac} alert${ac!==1?'s':''}</span></div>
${s.cpu_model?`<div style="font-size:.65rem;color:var(--text2);margin-top:.2rem">${s.cpu_model.substring(0,45)}</div>`:''}
${upStr?`<div style="font-size:.65rem;color:var(--text2)">Up ${upStr}</div>`:''}
</div>`}).join('');
}
function updateServerSelect(){
const sel=document.getElementById('server-select');
const cur=sel.value;
sel.innerHTML='<option value="">Fleet Overview</option>'+servers.map(s=>`<option value="${s.hostname}"${s.hostname===cur?' selected':''}>${s.hostname}${s.online?'':' (offline)'}</option>`).join('');
}
/* ── Detail Renderers ────────────────────────────────────────────────── */
function renderOverviewFromState(){
const disks=state.disks||[],pools=state.pools||[],alerts=state.alertsActive||[];
const s={total_raw:disks.reduce((a,d)=>a+(d.size||0),0),total_usable:pools.reduce((a,p)=>a+(p.size||0),0),total_used:pools.reduce((a,p)=>a+(p.allocated||0),0),total_free:pools.reduce((a,p)=>a+(p.free||0),0)};
const up=s.total_usable>0?(s.total_used/s.total_usable*100):0;
const bc=up>90?'var(--red)':up>75?'var(--yellow)':'var(--accent)';
document.getElementById('overview-cards').innerHTML=`
<div class="card"><div class="card-label">Raw Storage</div><div class="card-value">${B(s.total_raw)}</div><div class="card-sub">${disks.length} disk${disks.length!==1?'s':''}</div></div>
<div class="card"><div class="card-label">Used</div><div class="card-value">${B(s.total_used)}</div><div class="stat-bar"><div class="stat-bar-fill" style="width:${up.toFixed(1)}%;background:${bc}"></div></div><div class="card-sub">${pct(up)} of ${B(s.total_usable)}</div></div>
<div class="card"><div class="card-label">Free</div><div class="card-value">${B(s.total_free)}</div><div class="card-sub">${pools.length} pool${pools.length!==1?'s':''}</div></div>
<div class="card"><div class="card-label">Health</div><div class="card-value" style="color:${alerts.length>0?'var(--red)':'var(--green)'}">${alerts.length>0?alerts.length+' Alert'+(alerts.length>1?'s':''):'All Clear'}</div><div class="card-sub">${disks.filter(d=>d.health!==false).length}/${disks.length} disks · ${pools.filter(p=>p.health==='ONLINE').length}/${pools.length} pools</div></div>`;
}
function renderSystem(si){
const el=document.getElementById('system-cards');if(!el||!si)return;
const up=si.uptime_seconds||0,upStr=fmtUptime(up);
const ramUsed=(si.ram_total||0)-(si.ram_available||0);
const ramPct=si.ram_total>0?(ramUsed/si.ram_total*100):0;
const ramCol=ramPct>90?'var(--red)':ramPct>75?'var(--yellow)':'var(--accent)';
el.innerHTML=`
<div class="card"><div class="card-label">Kernel</div><div class="card-value" style="font-size:.85rem">${si.kernel||'—'}</div></div>
<div class="card"><div class="card-label">ZFS Version</div><div class="card-value" style="font-size:.85rem">${si.zfs_version||'—'}</div></div>
<div class="card"><div class="card-label">Uptime</div><div class="card-value" style="font-size:.85rem">${upStr||'—'}</div></div>
<div class="card"><div class="card-label">RAM</div><div class="card-value" style="font-size:.85rem">${B(ramUsed)} / ${B(si.ram_total)}</div><div class="stat-bar"><div class="stat-bar-fill" style="width:${ramPct.toFixed(1)}%;background:${ramCol}"></div></div><div class="card-sub">${si.cpu_count||0} cores · ${(si.cpu_model||'—').substring(0,40)}</div></div>`;
}
function renderPools(pools){
const el=document.getElementById('pool-cards'),badge=document.getElementById('pool-count-badge');
if(!pools||!pools.length){el.innerHTML='<div class="empty">No ZFS pools detected</div>';badge.textContent='0';return}
badge.textContent=pools.length;
el.innerHTML=pools.map(p=>{
const h=hc(p.health),up=p.size>0?(p.allocated/p.size*100):0,bc=up>90?'var(--red)':up>75?'var(--yellow)':'var(--accent)';
const scrubCol=p.scrub_age_days!=null?(p.scrub_age_days>14?'var(--red)':p.scrub_age_days>7?'var(--yellow)':'var(--green)'):'var(--text2)';
const scrubTxt=p.scrub_age_days!=null?p.scrub_age_days+'d ago':'N/A';
let vh='';
if(p.vdevs&&p.vdevs.length){vh=`<div class="vdev-tree"><div class="vr vh"><span>Name</span><span>State</span><span>Read</span><span>Write</span><span>Cksum</span></div>
${p.vdevs.map(v=>{const pad='&nbsp;'.repeat(Math.max(0,v.indent-1)*2);const rc=v.read!=='0'&&v.read!=='-'?'err':'zero';const wc=v.write!=='0'&&v.write!=='-'?'err':'zero';const cc=v.cksum!=='0'&&v.cksum!=='-'?'err':'zero';return`<div class="vr"><span>${pad}${v.name}</span><span class="${hc(v.state)}" style="font-size:.7rem">${v.state}</span><span class="${rc}">${v.read}</span><span class="${wc}">${v.write}</span><span class="${cc}">${v.cksum}</span></div>`}).join('')}</div>`}
return`<div class="card" style="padding:1rem">
<div class="pool-header"><span class="pool-name">${p.name}</span><span class="hb ${h}">${p.health}</span></div>
<div class="stat-bar" style="margin-bottom:.5rem"><div class="stat-bar-fill" style="width:${up.toFixed(1)}%;background:${bc}"></div></div>
<div class="pool-stats">
<div><div class="ps-l">Size</div><div class="ps-v">${B(p.size)}</div></div>
<div><div class="ps-l">Used</div><div class="ps-v">${B(p.allocated)}</div></div>
<div><div class="ps-l">Free</div><div class="ps-v">${B(p.free)}</div></div>
<div><div class="ps-l">Frag</div><div class="ps-v">${p.fragmentation}%</div></div>
<div><div class="ps-l">Dedup</div><div class="ps-v">${p.dedup}x</div></div>
<div><div class="ps-l">Ashift</div><div class="ps-v">${p.ashift||'—'}</div></div>
<div><div class="ps-l">Scrub</div><div class="ps-v" style="color:${scrubCol}">${scrubTxt}</div></div>
</div>
${p.scan?`<div style="font-size:.7rem;color:var(--text2);margin-bottom:.35rem">${p.scan}</div>`:''}
${p.errors_summary?`<div style="font-size:.7rem;color:${p.errors_summary.includes('No known')?'var(--text2)':'var(--red)'}">${p.errors_summary}</div>`:''}
${vh}</div>`}).join('');
}
function renderArc(arc){
const el=document.getElementById('arc-cards');
if(!arc){el.innerHTML='<div class="empty" style="grid-column:1/-1">ARC not available</div>';return}
el.innerHTML=`
<div class="card"><div class="card-label">Size</div><div class="card-value" style="font-size:1.2rem">${B(arc.size)}</div><div class="card-sub">Max ${B(arc.max_size)}</div></div>
<div class="card"><div class="card-label">Hit Rate</div><div class="card-value" style="font-size:1.2rem;color:${arc.hit_rate>90?'var(--green)':arc.hit_rate>70?'var(--yellow)':'var(--red)'}">${pct(arc.hit_rate)}</div><div class="card-sub">Lifetime: ${pct(arc.lifetime_hit_rate)}</div></div>
<div class="card"><div class="card-label">MRU / MFU</div><div class="card-value" style="font-size:.95rem">${B(arc.mru_size)}</div><div class="card-sub">MFU: ${B(arc.mfu_size)}</div></div>
<div class="card"><div class="card-label">L2ARC</div><div class="card-value" style="font-size:.95rem">${arc.l2_size>0?B(arc.l2_size):'N/A'}</div><div class="card-sub">${arc.l2_size>0?(arc.l2_hits||0).toLocaleString()+' hits':'None'}</div></div>`;
}
function renderDisks(disks){
const el=document.getElementById('disk-cards'),badge=document.getElementById('disk-count-badge');
if(!disks||!disks.length){el.innerHTML='<div class="empty">No disks detected</div>';badge.textContent='0';return}
badge.textContent=disks.length;
el.innerHTML=disks.map((d,i)=>{
const proto=(d.protocol||d.transport||'').toUpperCase();
let dtc='dtb';if(proto.includes('SAS')||proto.includes('SCSI'))dtc+=' dtb-sas';else if(proto.includes('NVME'))dtc+=' dtb-nv';
const ht=d.health===true?'PASSED':d.health===false?'FAILED':'N/A';
const hs=d.health_score!=null?d.health_score:'-';
return`<div class="card" style="padding:1rem">
<div class="disk-header"><div><span class="disk-name">/dev/${d.name}</span>${d.pool?`<span style="font-size:.65rem;color:var(--text2);margin-left:.35rem">${d.pool}</span>`:''}</div><div style="display:flex;gap:.25rem;align-items:center"><span class="score-badge" style="background:${scoreBg(hs)};color:${scoreCol(hs)}">${hs}</span><span class="${dtc}">${proto||'?'}</span><span class="hb ${d.health===true?'h-on':d.health===false?'h-flt':'h-unk'}">${ht}</span></div></div>
<div class="disk-stats">
<div class="ds"><div class="ds-l">Capacity</div><div class="ds-v">${B(d.user_capacity||d.size)}</div></div>
<div class="ds"><div class="ds-l">Temp</div><div class="ds-v ${tc(d.temperature)}">${d.temperature!=null?d.temperature+'°C':'—'}</div></div>
<div class="ds"><div class="ds-l">Power-On</div><div class="ds-v">${fmtH(d.power_on_hours)}</div></div>
<div class="ds"><div class="ds-l">RPM</div><div class="ds-v">${d.rotation_rate||'—'}</div></div>
</div>
<div style="font-size:.7rem;color:var(--text2);margin-bottom:.35rem"><strong>${d.device_model||d.model||'—'}</strong> · ${d.serial_number||d.serial||'—'} · FW ${d.firmware||'—'}</div>
<button class="btn btn-sm btn-ghost" onclick="openSmartModal(${i})">SMART Details</button></div>`}).join('');
}
function openSmartModal(idx){
const d=state.disks[idx];if(!d)return;
const proto=(d.protocol||d.transport||'').toUpperCase();
let dtc='dtb';if(proto.includes('SAS')||proto.includes('SCSI'))dtc+=' dtb-sas';else if(proto.includes('NVME'))dtc+=' dtb-nv';
const ht=d.health===true?'PASSED':d.health===false?'FAILED':'N/A';
const hs=d.health_score!=null?d.health_score:'-';
let smartHtml='';
if(d.smart_attributes&&d.smart_attributes.length){
smartHtml=`<div style="margin-top:.85rem"><div class="card-label" style="margin-bottom:.35rem">SMART Attributes</div>
<div style="max-height:360px;overflow-y:auto;border:1px solid var(--border);border-radius:6px"><table class="st"><thead><tr><th>ID</th><th>Attribute</th><th>Val</th><th>Wrst</th><th>Thr</th><th>Raw</th><th>Status</th></tr></thead><tbody>
${d.smart_attributes.map(a=>{let c='',s='✓';if(a.when_failed){c='attr-crit';s='✗ FAIL'}else if(a.value>0&&a.threshold>0&&a.value<=a.threshold){c='attr-warn';s='⚠ LOW'}const r=typeof a.raw==='number'?a.raw.toLocaleString():a.raw;const sg=SEAGATE.has(a.id);return`<tr class="${c}"><td>${a.id}</td><td>${a.name}${sg?' <span class="attr-note">(vendor)</span>':''}</td><td>${a.value}</td><td>${a.worst}</td><td>${a.threshold}</td><td>${r}</td><td style="color:${c==='attr-crit'?'var(--red)':c==='attr-warn'?'var(--yellow)':'var(--green)'}">${s}</td></tr>`}).join('')}
</tbody></table></div></div>`;
}
let sasHtml='';
if(d.sas_error_counters){
sasHtml='<div style="margin-top:.85rem"><div class="card-label" style="margin-bottom:.35rem">SAS Errors</div><table class="st"><thead><tr><th>Type</th><th>Uncorrected</th><th>Corrected</th></tr></thead><tbody>';
for(const[t,v]of Object.entries(d.sas_error_counters)){const u=v.total_uncorrected_errors||0;sasHtml+=`<tr><td>${t}</td><td class="${u>0?'attr-crit':''}">${u.toLocaleString()}</td><td>${(v.total_errors_corrected||0).toLocaleString()}</td></tr>`}
sasHtml+='</tbody></table></div>';
}
if(d.grown_defect_count!=null)sasHtml+=`<div style="margin-top:.4rem;font-size:.75rem"><strong>Grown Defects:</strong> <span style="color:${d.grown_defect_count>0?'var(--red)':'var(--green)'};font-weight:600">${d.grown_defect_count}</span></div>`;
document.getElementById('smart-modal-content').innerHTML=`
<h2><span>/dev/${d.name}</span><span class="score-badge" style="background:${scoreBg(hs)};color:${scoreCol(hs)}">${hs}/100</span><span class="${dtc}">${proto}</span><span class="hb ${d.health===true?'h-on':d.health===false?'h-flt':'h-unk'}">${ht}</span></h2>
<div class="si-grid">
<div class="si-item"><div class="si-l">Model</div><div class="si-v">${d.device_model||d.model||'—'}</div></div>
<div class="si-item"><div class="si-l">Serial</div><div class="si-v">${d.serial_number||d.serial||'—'}</div></div>
<div class="si-item"><div class="si-l">Firmware</div><div class="si-v">${d.firmware||'—'}</div></div>
<div class="si-item"><div class="si-l">Capacity</div><div class="si-v">${B(d.user_capacity||d.size)}</div></div>
<div class="si-item"><div class="si-l">Temp</div><div class="si-v ${tc(d.temperature)}">${d.temperature!=null?d.temperature+'°C':'—'}</div></div>
<div class="si-item"><div class="si-l">Power-On</div><div class="si-v">${fmtH(d.power_on_hours)}</div></div>
<div class="si-item"><div class="si-l">RPM</div><div class="si-v">${d.rotation_rate||'—'}</div></div>
<div class="si-item"><div class="si-l">Form</div><div class="si-v">${d.form_factor||'—'}</div></div>
${d.model_family?`<div class="si-item"><div class="si-l">Family</div><div class="si-v">${d.model_family}</div></div>`:''}
${d.pool?`<div class="si-item"><div class="si-l">Pool</div><div class="si-v">${d.pool}</div></div>`:''}
</div>
${smartHtml}${sasHtml}${!smartHtml&&!sasHtml?'<div class="empty">No SMART data</div>':''}
<div style="margin-top:.75rem;display:flex;gap:.4rem">
<button class="btn btn-sm btn-ghost" onclick="runSmartTest('/dev/${d.name}','short')">Short Self-Test</button>
<button class="btn btn-sm btn-ghost" onclick="runSmartTest('/dev/${d.name}','long')">Long Self-Test</button>
</div>`;
document.getElementById('smart-modal').classList.add('open');
}
function closeSmartModal(){document.getElementById('smart-modal').classList.remove('open')}
function runSmartTest(device,type){
if(!confirm(`Run ${type} SMART self-test on ${device}?`))return;
if(ws&&ws.readyState===1){ws.send(JSON.stringify({type:'smarttest',hostname:selectedServer,device,test_type:type}))}else{alert('Not connected.')}
}
function renderIOStats(rates,pm){
let tR=0,tW=0,tRI=0,tWI=0,tRL=0,tWL=0,active=0;const rows=[];
for(const[n,d]of Object.entries(rates)){tR+=d.read_bps||0;tW+=d.write_bps||0;tRI+=d.read_iops||0;tWI+=d.write_iops||0;rows.push({name:n,...d,pool:pm[n]||''})}
rows.sort((a,b)=>a.name.localeCompare(b.name));
for(const r of rows){if((r.busy_pct||0)>1){tRL+=r.read_lat_ms||0;tWL+=r.write_lat_ms||0;active++}}
const avgLat=active>0?((tRL+tWL)/(active*2)):0;
const se=document.getElementById('io-summary-cards');
if(se)se.innerHTML=`
<div class="card"><div class="card-label">Read</div><div class="card-value" style="font-size:1.15rem;color:var(--accent)">${Bps(tR)}</div><div class="card-sub">${tRI.toFixed(1)} IOPS</div></div>
<div class="card"><div class="card-label">Write</div><div class="card-value" style="font-size:1.15rem;color:var(--orange)">${Bps(tW)}</div><div class="card-sub">${tWI.toFixed(1)} IOPS</div></div>
<div class="card"><div class="card-label">Combined</div><div class="card-value" style="font-size:1.15rem">${Bps(tR+tW)}</div><div class="card-sub">${(tRI+tWI).toFixed(1)} IOPS</div></div>
<div class="card"><div class="card-label">Active</div><div class="card-value" style="font-size:1.15rem">${active}/${rows.length}</div><div class="card-sub">${avgLat>0?'Avg lat '+avgLat.toFixed(1)+'ms':'Idle'}</div></div>`;
const tb=document.getElementById('io-disk-tbody');
if(tb)tb.innerHTML=rows.map(r=>`<tr><td>${r.name}</td><td class="r">${Bps(r.read_bps)}</td><td class="r">${Bps(r.write_bps)}</td><td class="r">${(r.read_iops||0).toFixed(1)}</td><td class="r">${(r.write_iops||0).toFixed(1)}</td><td class="r" style="color:${latCol(r.read_lat_ms||0)}">${fmtLat(r.read_lat_ms||0)}</td><td class="r" style="color:${latCol(r.write_lat_ms||0)}">${fmtLat(r.write_lat_ms||0)}</td><td class="r" style="color:${busyCol(r.busy_pct||0)}">${(r.busy_pct||0).toFixed(1)}%</td><td>${r.pool||'—'}</td></tr>`).join('');
}
let dsSortCol='name',dsSortAsc=true;
function renderDatasets(ds){
const el=document.getElementById('ds-tbody');if(!ds||!ds.length){el.innerHTML='<tr><td colspan="7" class="empty">No datasets</td></tr>';return}
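  /* Coerce both sides to strings before localeCompare so a null value (e.g. an unmounted dataset) cannot throw on a descending sort. */
  const s=[...ds].sort((a,b)=>{const va=a[dsSortCol],vb=b[dsSortCol];if(typeof va==='string'||typeof vb==='string'){const c=String(va==null?'':va).localeCompare(String(vb==null?'':vb));return dsSortAsc?c:-c}return dsSortAsc?(va||0)-(vb||0):(vb||0)-(va||0)});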
el.innerHTML=s.map(d=>`<tr><td>${d.name}</td><td class="r">${B(d.used)}</td><td class="r">${B(d.available)}</td><td class="r">${B(d.referenced)}</td><td>${d.compression}</td><td class="r">${d.compressratio}</td><td>${d.mountpoint||'—'}</td></tr>`).join('');
updateSortUI();
}
function updateSortUI(){document.querySelectorAll('#ds-table th.sort').forEach(th=>{const base=th.textContent.replace(/\s*[▲▼]$/,'');th.textContent=th.dataset.sort===dsSortCol?base+(dsSortAsc?' ▲':' ▼'):base})}
document.querySelectorAll('#ds-table th.sort[data-sort]').forEach(th=>{th.addEventListener('click',()=>{const c=th.dataset.sort;if(dsSortCol===c)dsSortAsc=!dsSortAsc;else{dsSortCol=c;dsSortAsc=true}renderDatasets(state.datasets)})});
function renderSnapshots(snaps){
const el=document.getElementById('snap-tbody'),badge=document.getElementById('snap-count-badge');
if(!snaps||!snaps.length){el.innerHTML='<tr><td colspan="4" class="empty">No snapshots</td></tr>';badge.textContent='0';return}
badge.textContent=snaps.length;
el.innerHTML=snaps.map(s=>{let cr=s.creation;if(typeof cr==='number')cr=new Date(cr*1000).toLocaleString();return`<tr><td>${s.name}</td><td class="r">${B(s.used)}</td><td class="r">${B(s.referenced)}</td><td>${cr}</td></tr>`}).join('');
}
function renderAlerts(data){
if(!data)return;
state.alertsActive=data.active||[];
state.alertLog=data.log||[];
document.getElementById('alert-active').innerHTML=state.alertsActive.length?state.alertsActive.map(a=>`<div class="alert-item alert-${a.severity}"><span>${a.message}</span></div>`).join(''):'<div style="font-size:.8rem;color:var(--text2);padding:.35rem">No active alerts</div>';
document.getElementById('alert-log').innerHTML=state.alertLog.length?state.alertLog.slice(0,40).map(a=>`<div class="alert-item alert-${a.severity}"><span class="alert-time">${relTime(a.timestamp)}</span><span>${a.message}</span></div>`).join(''):'<div style="font-size:.8rem;color:var(--text2);padding:.35rem">No alerts logged</div>';
}
/* ── WebSocket ───────────────────────────────────────────────────────── */
function connectWS(){
const proto=location.protocol==='https:'?'wss:':'ws:';
ws=new WebSocket(`${proto}//${location.host}/ws`);
ws.onopen=()=>{
document.getElementById('conn-dot').style.background='var(--green)';
document.getElementById('conn-dot').title='Connected';
if(selectedServer)ws.send(JSON.stringify({type:'subscribe',hostname:selectedServer}));
};
ws.onclose=()=>{
document.getElementById('conn-dot').style.background='var(--red)';
document.getElementById('conn-dot').title='Disconnected';
setTimeout(connectWS,3000);
};
ws.onerror=()=>{};
ws.onmessage=e=>{
try{handleMessage(JSON.parse(e.data))}catch(err){console.error('WS parse:',err)}
};
}
function handleMessage(msg){
switch(msg.type){
case 'servers':
servers=msg.servers;
updateServerSelect();
if(!selectedServer)renderFleet();
break;
case 'settings':
settingsData=msg.settings||{};
break;
case 'full_state':
loadFullState(msg);
break;
case 'io':
if(msg.hostname===selectedServer){
state.ioRates=msg.rates||{};state.poolMap=msg.pool_map||{};
renderIOStats(msg.rates||{},msg.pool_map||{});
appendIO(msg.rates||{},msg.ts);
appendTemps(msg.temps||{},msg.ts);
}
break;
case 'arc':
if(msg.hostname===selectedServer){
state.arc=msg.arc;
renderArc(msg.arc);
appendArc(msg.arc,msg.ts);
}
break;
case 'pools':
if(msg.hostname===selectedServer){state.pools=msg.pools||[];renderPools(state.pools);renderOverviewFromState()}
break;
case 'disks':
if(msg.hostname===selectedServer){state.disks=msg.disks||[];renderDisks(state.disks);renderOverviewFromState()}
break;
case 'datasets':
if(msg.hostname===selectedServer){state.datasets=msg.datasets||[];renderDatasets(state.datasets)}
break;
case 'snapshots':
if(msg.hostname===selectedServer){state.snapshots=msg.snapshots||[];renderSnapshots(state.snapshots)}
break;
case 'system':
if(msg.hostname===selectedServer)renderSystem(msg.info||{});
break;
case 'alerts':
if(msg.hostname===selectedServer)renderAlerts({active:msg.active||[],log:msg.log||[]});
break;
case 'smarttest_result':
if(msg.hostname===selectedServer)alert(msg.success?`${msg.test_type} test started on ${msg.device}.`:`Failed: ${msg.output||'Unknown error'}`);
break;
case 'test_notification_result':
alert(msg.success?'Notification sent!':'Failed. Check URL & token.');
break;
}
}
function loadFullState(msg){
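  /* Ignore a late snapshot for a server that is no longer selected. */
  if(msg.hostname&&msg.hostname!==selectedServer)return;
  const c=msg.current||{},h=msg.history||{};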
document.title='ZPulse — '+(msg.hostname||'');
  /* Render every panel even when a key is missing, so data from the previously selected server does not linger on screen. */
  state.pools=(c.pools&&c.pools.pools)||[];renderPools(state.pools);
  state.disks=(c.disks&&c.disks.disks)||[];renderDisks(state.disks);
  state.datasets=(c.datasets&&c.datasets.datasets)||[];renderDatasets(state.datasets);
  state.snapshots=(c.snapshots&&c.snapshots.snapshots)||[];renderSnapshots(state.snapshots);
  if(c.system&&c.system.info)renderSystem(c.system.info);
  state.ioRates=(c.io&&c.io.rates)||{};state.poolMap=(c.io&&c.io.pool_map)||{};renderIOStats(state.ioRates,state.poolMap);
  state.arc=(c.arc&&c.arc.arc)||null;renderArc(state.arc);
renderOverviewFromState();
if(h&&h.timestamps&&h.timestamps.length)loadHist(h);
if(msg.alerts)renderAlerts(msg.alerts);
}
function switchServer(hostname){
selectedServer=hostname||null;
if(selectedServer){
document.getElementById('fleet').style.display='none';
document.getElementById('server-detail').style.display='';
document.getElementById('detail-nav').style.display='flex';
document.title='ZPulse — '+selectedServer;
    state.disks=[];state.pools=[];state.datasets=[];state.snapshots=[];state.ioRates={};state.poolMap={};state.arc=null;state.alertsActive=[];state.alertLog=[];
destroyCharts();initCharts();
if(ws&&ws.readyState===1)ws.send(JSON.stringify({type:'subscribe',hostname:selectedServer}));
}else{
document.getElementById('fleet').style.display='';
document.getElementById('server-detail').style.display='none';
document.getElementById('detail-nav').style.display='none';
document.title='ZPulse — Fleet';
destroyCharts();
renderFleet();
}
}
/* ── Settings ────────────────────────────────────────────────────────── */
/* Populate the settings modal from the last settings object received from the server.
   `??` (not `||`) keeps an explicit 0, e.g. a cooldown of 0 seconds, instead of replacing it with the default. */
function loadSettingsUI(){
const s=settingsData||{};
document.getElementById('set-gotify-url').value=s.gotify_url||'';
document.getElementById('set-gotify-token').value=s.gotify_token||'';
document.getElementById('set-temp-warn').value=s.alert_temp_warning??45;
document.getElementById('set-temp-crit').value=s.alert_temp_critical??55;
document.getElementById('set-space-warn').value=s.alert_space_warning??80;
document.getElementById('set-space-crit').value=s.alert_space_critical??90;
document.getElementById('set-cooldown').value=s.alert_cooldown??3600;
document.getElementById('set-smart-alerts').checked=s.alert_smart_enabled!==false;
document.getElementById('set-pool-alerts').checked=s.alert_pool_enabled!==false;
}
function openSettings(){loadSettingsUI();document.getElementById('settings-modal').classList.add('open')}
function closeSettings(){document.getElementById('settings-modal').classList.remove('open')}
/* Read an integer input as base 10; use the default only when the field is empty or not a number. */
function intField(id,d){const n=parseInt(document.getElementById(id).value,10);return Number.isNaN(n)?d:n}
function saveSettings(){
const b={
gotify_url:document.getElementById('set-gotify-url').value.trim(),
gotify_token:document.getElementById('set-gotify-token').value.trim(),
alert_temp_warning:intField('set-temp-warn',45),
alert_temp_critical:intField('set-temp-crit',55),
alert_space_warning:intField('set-space-warn',80),
alert_space_critical:intField('set-space-crit',90),
alert_cooldown:intField('set-cooldown',3600),
alert_smart_enabled:document.getElementById('set-smart-alerts').checked,
alert_pool_enabled:document.getElementById('set-pool-alerts').checked
};
if(ws&&ws.readyState===1)ws.send(JSON.stringify({type:'save_settings',settings:b}));
closeSettings();
}
function testNotification(){
saveSettings();
if(ws&&ws.readyState===1)ws.send(JSON.stringify({type:'test_notification'}));
}
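/* Sketch only, not wired in: the guard `ws&&ws.readyState===1` recurs in switchServer, saveSettings and
   testNotification. A helper along these lines could centralize it and let callers tell the user when
   nothing was sent. The name sendIfOpen and its boolean return value are assumptions for illustration,
   not part of ZPulse. */
function sendIfOpen(sock,payload){
/* readyState 1 is WebSocket.OPEN; in any other state the message is skipped. */
if(!sock||sock.readyState!==1)return false;
sock.send(JSON.stringify(payload));
return true;
}
/* Possible usage: if(!sendIfOpen(ws,{type:'test_notification'}))alert('Not connected.'); */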
/* ── Init ────────────────────────────────────────────────────────────── */
document.addEventListener('keydown',e=>{if(e.key==='Escape'){closeSettings();closeSmartModal()}});
connectWS();
</script>
</body>
</html>