Troubleshooting
I try to document the cases where things did not work as I intended.
General "Rule of Thumb" Workflow
- Check State: docker ps -a (Is it restarting? Exited?)
- Check Logs: docker logs <container_name> --tail 50 (Read the actual error)
- Check Resources: docker stats --no-stream (Is CPU/RAM spiked?)
- Check Connections: curl -v http://<ip>:<port> (Is the port actually open?)
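The four checks above can be bundled into one helper so they always run in the same order. This is only a sketch: the function name triage and its argument order are my own choices for illustration, and the 5-second curl timeout is an arbitrary assumption.

```shell
# triage <container> <ip> <port>: run the four rule-of-thumb checks in order.
# The function name and argument order are illustrative assumptions.
triage() {
  container="$1"; ip="$2"; port="$3"

  echo "== State =="
  docker ps -a --filter "name=$container"

  echo "== Logs (last 50 lines) =="
  docker logs --tail 50 "$container" 2>&1

  echo "== Resources =="
  docker stats --no-stream "$container"

  echo "== Connection =="
  curl -v --max-time 5 "http://$ip:$port" 2>&1 || echo "connection failed"
}
```

Called as triage <container_name> <ip> <port>, it prints each check under its own heading, which makes it easier to paste the whole result into a note.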
High CPU / Resource Usage
Symptom
System fans spin up, load average spikes, or UI becomes unresponsive.
Check
Use docker stats to identify the offending container. The --no-stream flag gives a clean snapshot instead of a jumping live feed.
docker stats --no-stream
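With many containers running, sorting the snapshot by CPU makes the offender easier to spot. A small sketch, assuming the .Name and .CPUPerc fields of the docker stats --format template; the helper name sort_by_cpu is made up for this example.

```shell
# Sort "name<TAB>cpu%" lines by the CPU column, highest first.
# sort -n reads the leading number and ignores the trailing "%".
# LC_ALL=C keeps "." as the decimal separator regardless of locale.
sort_by_cpu() {
  LC_ALL=C sort -t "$(printf '\t')" -k2,2 -rn
}

# Usage (needs a running Docker daemon):
#   docker stats --no-stream --format '{{.Name}}\t{{.CPUPerc}}' | sort_by_cpu | head -n 5
```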
Case Study: The "Log Flood" (CrowdSec)
- Observation: CrowdSec using 400% CPU. Logs show rapid processing of internal IP addresses.
- Diagnosis: Caddy was logging internal health checks (from Homepage), flooding the parser.
- The Fix: Configure Caddy to skip logging for internal IPs.
  - Code: log_skip @internal in Caddyfile.
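Before changing the Caddyfile, it can be worth confirming that the addresses flooding the logs really are internal. The sketch below classifies a single IPv4 address; the function name is hypothetical, and "internal" is assumed here to mean loopback plus the RFC 1918 private ranges.

```shell
# Return 0 if the IPv4 address is loopback or in an RFC 1918 private range.
# "Internal" is an assumption of this sketch: 10/8, 172.16/12, 192.168/16, 127/8.
is_internal_ipv4() {
  case "$1" in
    10.*|127.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   is_internal_ipv4 172.20.0.28 && echo "internal"
```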
Case Study: The "Engine Bottleneck" (CrowdSec)
- Observation: Logs are quiet (no flood), but CrowdSec CPU is still high (~500%).
- Diagnosis: Default CrowdSec config is single-threaded. On my CPU (Ryzen 7600X), this creates a queue backlog.
- The Fix:
  - Parallelization: Edit crowdsec/config/config.yaml to increase parser_routines to 6 (matching CPU cores).
  - Polling Frequency: Edit Caddyfile to increase ticker_interval to 60s (reduces how often Caddy asks the Agent for updates).
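Before editing, it can help to see what parser_routines is currently set to and how many CPUs the host reports. A sketch, assuming the setting appears as a plain "key: number" line in the YAML file; the helper name is hypothetical.

```shell
# Print the number assigned to the first parser_routines key in a YAML file.
# Assumes a simple "parser_routines: <number>" line; prints nothing if absent.
parser_routines_in() {
  sed -n 's/^[[:space:]]*parser_routines:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1
}

# Usage:
#   parser_routines_in crowdsec/config/config.yaml
#   nproc   # logical CPUs; with SMT enabled this is higher than the core count
```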
Connection Refused / Service Down
Symptom
A service (like Portainer or WUD) cannot connect to another service (like Socket Proxy), showing ECONNREFUSED or ENOTFOUND.
Check
Check if the target is actually listening on the expected port from inside the network.
# 1. Check if the target is running
docker ps | grep socket-proxy
# 2. Check container logs for startup errors
docker logs socket-proxy
# 3. Verify Internal DNS resolution (from another container)
docker exec -it wud ping socket-proxy
Case Study: WUD & Portainer vs. Socket Proxy
- Observation: WUD logs showed getaddrinfo ENOTFOUND socket-proxy.
- Diagnosis: Docker's internal DNS sometimes fails to resolve container names immediately on boot.
- The Fix: Switch from Hostname (socket-proxy) to Static IP (172.20.0.28).
  - Config: WUD_WATCHER_LOCAL_HOST=172.20.0.28
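An alternative to the static IP (not the fix used above) is to wait until the name resolves before anything depends on it. The sketch below is a generic retry loop, not a feature of WUD or Docker; the name retry and its arguments are assumptions for illustration.

```shell
# retry <attempts> <delay_seconds> <command...>
# Run the command until it succeeds or the attempts run out.
retry() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Usage (inside a container whose image ships getent):
#   retry 10 2 getent hosts socket-proxy
```

The static IP stays the simpler option when the address is fixed anyway. A retry only makes sense where the startup command can be wrapped.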
Storage / Disk Missing in Dashboard
Symptom
Homepage reports Drive not found for target: /mnt/<name>_disk or Beszel shows 0 Disk I/O.
Check
Verify what the container actually sees mounted.
docker exec homepage df -h
Case Study: The "Ghost" Mount (Homepage)
- Observation: df -h inside the container did NOT show the mount, even though compose had it.
- Diagnosis: Startup Race Condition. Docker started before the OS finished mounting the LVM drive.
- The Fix: Point the widget to /app/config (internal path) instead of an external /mnt path.
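The race itself can be detected on the host by checking whether the mount point is really mounted before the containers start. A minimal sketch, assuming a Linux host where /proc/mounts lists mount points in its second column; the function name is illustrative, and /mnt/pool01 in the usage line is borrowed from the other sections.

```shell
# is_mounted <mount_point> [mounts_table]
# Succeeds if the mount point appears in the mounts table (default: /proc/mounts).
is_mounted() {
  awk -v target="$1" '$2 == target { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage:
#   is_mounted /mnt/pool01 || echo "/mnt/pool01 is not mounted yet"
```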
Case Study: Beszel "Zero Speed"
- Observation: Beszel showed Disk Usage (%) but 0 MB/s Read/Write.
- Diagnosis: Docker mounts (- /mnt/pool01:/data) abstract the file system. The container cannot see the Kernel Device (dm-0).
- The Fix: Use Device Mapper mounting.
  - Config: /mnt/pool01/media/.beszel:/extra-filesystems/dm-2__Media:ro
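The dm-N name behind a mount can be looked up rather than guessed. A sketch, assuming the mount source is a /dev/mapper symlink as is usual with LVM; findmnt and readlink are standard Linux tools, while the helper name is hypothetical.

```shell
# Print the kernel device name (e.g. dm-2) behind a device path or symlink.
kernel_device_name() {
  basename "$(readlink -f "$1")"
}

# Usage on the host (-T resolves the filesystem that contains the path):
#   kernel_device_name "$(findmnt -n -o SOURCE -T /mnt/pool01/media)"
```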
External Access Fails (Mobile Only)
Symptom
Jellyfin works on LAN (Wi-Fi) and Desktop Browser (4G), but fails on Android App (4G).
Tools: SSL Labs Server Test (ssllabs.com).
Case Study: The IPv6 Trap
- Observation: SSL Labs showed IPv6 test failing.
- Diagnosis: 4G Mobile networks prioritize IPv6. Cloudflare was publishing an AAAA record, but our host wasn't routing IPv6 ingress correctly.
- The Fix: Deleted AAAA records in Cloudflare to force IPv4.
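Whether a hostname still publishes an AAAA record can be checked from any machine that has dig installed. A sketch; the helper name and the jellyfin.example.com hostname are placeholders, not the real setup.

```shell
# Succeeds if the hostname has at least one AAAA (IPv6) record.
has_aaaa() {
  [ -n "$(dig +short AAAA "$1")" ]
}

# Usage (jellyfin.example.com is a placeholder hostname):
#   has_aaaa jellyfin.example.com && echo "AAAA record still published"
#   curl -4 -sS -o /dev/null https://jellyfin.example.com   # force IPv4
#   curl -6 -sS -o /dev/null https://jellyfin.example.com   # force IPv6
```

If the -4 request succeeds and the -6 request fails, the problem is on the IPv6 path, which matches what the mobile app ran into.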
Homepage Issues
Issue: "API Error" on Storage Widgets
- Fix: Ensure volume is mounted (- /mnt/pool01/media:/mnt/media_disk:ro).
Issue: "Host validation failed"
- Fix: Set HOMEPAGE_ALLOWED_HOSTS=* in compose.
Issue: CrowdSec Widget Error
- Fix: If DB was wiped, update .env with new credentials from crowdsec/config/local_api_credentials.yaml.
WUD (What's Up Docker) Issues
Issue: Duplicate Containers / "Ghosts"
- Symptom: WUD shows 2 entries for every container (one "Local", one "Proxy").
- Cause: Defining WUD_WATCHER_PROXY_... creates a second watcher, while the default local watcher still tries to run.
- Fix: "Hijack" the local watcher by setting WUD_WATCHER_LOCAL_HOST=172.20.0.28 and removing all Proxy variables.
Issue: Updates not showing
- Fix: WUD scans on a CRON schedule. Restart the container (docker restart wud) to force an immediate re-scan.
Issue: Shutdown Corruption (Exit Code 137)
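- Symptom: WUD took 10.2s to stop and messed up its DB.
- Diagnosis: Docker's default kill timer is 10s. Heavy apps need more time to save state. Exit code 137 is 128 + 9, meaning the process was ended with SIGKILL rather than shutting down on its own.
- Fix: Added "shutdown-timeout": 30 to /etc/docker/daemon.json.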
GoAccess WebSocket Errors
Symptom
Dashboard loads, but the connection icon is Red/Disconnected.
Case Study: Origin Mismatch
- Observation: Browser console showed 400 Bad Request.
- Diagnosis: GoAccess has strict security. The URL in the browser (http://192.168.x.x) did not match the --origin flag in the container.
- Fix: Ensure --origin matches the exact URL we use to visit the dashboard.
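An origin is the scheme, host and port of a URL with no path, so pasting the full dashboard URL into --origin will not match. The sketch below trims a URL down to that form; the function name is made up for this example.

```shell
# Reduce a URL to its origin: scheme://host[:port], with no path or trailing slash.
origin_of() {
  printf '%s\n' "$1" | sed -E 's#^([A-Za-z][A-Za-z0-9+.-]*://[^/]+).*$#\1#'
}

# Usage: paste the address-bar URL of the dashboard and use the result for --origin
#   origin_of "<dashboard URL>"
```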