Application Watchdog Bot
Continuing with my SDRTrunk Discord project, I decided I needed a watchdog script to tell me when shiz goes south, and obviously, it needs to go to Discord. In my optimization craze, I elected to go with the shipper/listener concept: a small shipping client runs on the VM and only forwards log lines, which keeps overhead down compared with processing all the logs on the VM itself. The shipper pushes those lines to my scripts LXC, where a listener is set up to do all the heavy lifting. I also opted for a single channel in Discord and, of course, threads to keep things well organized. It also uses the bot that is already running the file parser, so there's no need to create a second bot just for this, unless you wanna.
Lifecycle & Launches Thread
This thread would be dedicated to the most critical information: is the application running, did it start correctly, and did it shut down cleanly?
- Successful Application Starts (from sdrtrunk_app.log)
- Clean Shutdowns (from sdrtrunk_app.log)
Errors & Critical Issues Thread
- Audio Output Failures (from sdrtrunk_app.log)
- Tuner Calibration Failures (from sdrtrunk_app.log)
- Runtime Null Pointer Exceptions (from sdrtrunk_app.log)
⚠️ Warnings & Performance Thread
- Tuner Contention (from sdrtrunk_app.log)
- PPM Auto-Correction (from sdrtrunk_app.log)
Ignore the Noise!
Just as important is knowing what not to post, to keep the channels clean.
- The entire startup banner: All the lines between SDRTrunk Version and Host OS Name are just static information. We can grab a few key details for the initial "Started" message, but we don't need to post all of them every time.
- Calibration Logs: The long stream of COMPLEX OSCILLATOR, FM DEMODULATOR, FIR FILTER messages are part of a one-time calibration routine. These are very noisy and not useful for day-to-day monitoring.
- Unknown Packet SAP: RESERVED 1: This INFO message is likely for debugging and doesn't represent a problem you need to act on.
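With the default APP_LOG_MIN_LEVEL of ERROR, the listener further down drops most of this noise implicitly, because INFO lines never reach Discord. If you lower that level, an explicit filter might be worth having. Here's a minimal sketch of one; the names NOISE_MARKERS and is_noise are my own illustration and are not part of listener.py, and the marker strings are just the ones listed above.

```python
# Hypothetical helper: NOISE_MARKERS and is_noise are illustrative names,
# not something listener.py defines. Marker strings come from the list above.
NOISE_MARKERS = (
    "COMPLEX OSCILLATOR",
    "FM DEMODULATOR",
    "FIR FILTER",
    "Unknown Packet SAP: RESERVED 1",
)


def is_noise(msg: str) -> bool:
    """Return True when a log message contains one of the known-noisy markers."""
    return any(marker in msg for marker in NOISE_MARKERS)
```

You could call is_noise(msg) near the top of the app-log handling and return early when it is True, before any embed gets built.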
Set Up and Run It
This is in two parts: the first is on the scripts LXC, and the second is on the Windows 11 VM running SDRTrunk.
Set Up Scripts LXC
Create the Directory
sudo mkdir -p /opt/sdrtrunk-watchdog
sudo chown your_user:your_user /opt/sdrtrunk-watchdog # Replace with your username
cd /opt/sdrtrunk-watchdog
Create the Python Environment
python3 -m venv venv
source venv/bin/activate
Install the Libraries
pip install python-dotenv requests
With that done, I usually exit the venv to make my files...
deactivate
listener.py
This file goes in the working directory, /opt/sdrtrunk-watchdog.
#!/usr/bin/env python3
# ==================================================================#
# SDR-Trunk-Discord Project v20251005.01.01 #
# Spartan X311 https://skynet2.net #
# All Your Comms Are Now Belong To Me #
# SDRTrunk Watchdog © 2025 by Spartan X311 is licensed #
# under CC BY-SA 4.0. To view a copy of this license, visit #
# https://creativecommons.org/licenses/by-sa/4.0/ #
# #
# listener.py #
# ==================================================================#
import os
import re
import time
import queue
import socket
import logging
import threading
from datetime import datetime, timezone
from dotenv import load_dotenv
import requests
# =======================
# Configuration & Logging
# =======================
load_dotenv()
LOGS_DIR = os.getenv("LOGS_DIR", ".").strip() or "."
os.makedirs(LOGS_DIR, exist_ok=True)
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[
        logging.FileHandler(os.path.join(LOGS_DIR, "listener.log"), encoding="utf-8"),
        logging.StreamHandler()
    ]
)
log = logging.getLogger("sdr-listener")
# --- Network Listener Config ---
LISTENER_HOST = os.getenv("LISTENER_HOST", "0.0.0.0")
LISTENER_PORT = int(os.getenv("LISTENER_PORT", "9999"))
# --- Discord Config ---
DISCORD_BOT_TOKEN = os.getenv("DISCORD_BOT_TOKEN", "").strip()
if not DISCORD_BOT_TOKEN:
    raise SystemExit("DISCORD_BOT_TOKEN missing from .env file")
LIFECYCLE_THREAD_ID = os.getenv("LIFECYCLE_THREAD_ID", "").strip()  # Lifecycle & Launches
ERRORS_THREAD_ID = os.getenv("ERRORS_THREAD_ID", "").strip()  # Errors & Critical Issues
PERFORMANCE_THREAD_ID = os.getenv("PERFORMANCE_THREAD_ID", "").strip()  # ⚠️ Warnings & Performance
# --- Log Filtering ---
APP_LOG_MIN_LEVEL = os.getenv("APP_LOG_MIN_LEVEL", "ERROR").strip().upper()
LEVEL_ORDER = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL", "SEVERE", "CRITICAL"]
def sev_idx(level: str) -> int: return LEVEL_ORDER.index(level) if level in LEVEL_ORDER else -1
MIN_IDX = sev_idx(APP_LOG_MIN_LEVEL) if APP_LOG_MIN_LEVEL in LEVEL_ORDER else sev_idx("ERROR")
# =======================
# Discord Poster
# =======================
class Discord:
    def __init__(self, token: str):
        self.session = requests.Session()
        self.session.headers.update({"Authorization": f"Bot {token}", "Content-Type": "application/json"})
        self.post_q = queue.Queue()
        self.worker = threading.Thread(target=self._worker, daemon=True)
        self.worker.start()

    def _worker(self):
        # Drain the queue on a background thread so log parsing never blocks on Discord
        while True:
            item = self.post_q.get()
            if item is None:
                break
            try:
                self._post_impl(**item)
            except Exception as e:
                log.error("Discord post failed: %s", e)
            finally:
                self.post_q.task_done()

    def enqueue_embed(self, thread_id: str, embed: dict):
        if not thread_id or not thread_id.isdigit():
            log.warning("Attempted to post to invalid thread_id: %s", thread_id)
            return
        self.post_q.put({"thread_id": thread_id, "embeds": [embed]})

    def _post_impl(self, thread_id: str, embeds=None):
        url = f"https://discord.com/api/v10/channels/{thread_id}/messages"
        body = {"embeds": embeds}
        for _ in range(5):  # 5 retries
            try:
                r = self.session.post(url, json=body, timeout=20)
                if r.status_code == 429:
                    retry_after = r.json().get("retry_after", 1)
                    log.warning("Rate limited. Waiting %s seconds.", retry_after)
                    time.sleep(retry_after)
                    continue
                if r.ok:
                    return
                log.error("Discord post failed [%s]: %s", r.status_code, r.text)
            except requests.RequestException as e:
                log.error("Network error during Discord post: %s", e)
            time.sleep(1)  # Wait before retrying on failure
discord = Discord(DISCORD_BOT_TOKEN)
# =======================
# Log Parsing & Embeds
# =======================
# For sdrtrunk_app.log
LINE_RE = re.compile(
    r"^(?P<date>\d{8})\s+(?P<time>\d{6}\.\d{3})\s+\[(?P<thr>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+(?P<class>\S+)\s*-\s*(?P<msg>.*)$"
)
# For launch-log.txt
LAUNCH_TS_RX = re.compile(r"^\[(?P<ts>[\d\-:\.\s]+)\]\s+Launching SDRTrunk")
KV_RX = re.compile(r"^\s*(?P<key>Username|Session|Running from|JAVA_EXE resolved to|Launching)\s*:\s*(?P<val>.*)$")
FAILED_RX = re.compile(r"^\s*FAILED with ERRORLEVEL=(?P<code>\d+)\s*$")
# --- Debouncing to prevent spam ---
debounce_cache = {}
DEBOUNCE_PERIOD_SEC = 3600 # 1 hour
def is_debounced(key: str) -> bool:
    # True if this key already fired within the debounce window; otherwise record it
    now = time.time()
    if key in debounce_cache and (now - debounce_cache[key]) < DEBOUNCE_PERIOD_SEC:
        return True
    debounce_cache[key] = now
    return False

def color_for_level(level: str) -> int:
    level = (level or "").upper()
    if level == "INFO":
        return 0x3498DB  # blue
    if level == "WARN":
        return 0xF1C40F  # yellow
    if level in ("ERROR", "FATAL", "SEVERE", "CRITICAL"):
        return 0xE74C3C  # red
    return 0x95A5A6  # grey default

def create_embed(title: str, description: str, color: int, fields=None):
    return {
        "title": title,
        "description": f"```\n{description[:4000]}\n```",
        "color": color,
        "fields": fields or [],
        "timestamp": datetime.now(timezone.utc).isoformat()
    }
# =======================
# Log Processing Logic
# =======================
def process_log_line(line: str, source: str):
    # --- LAUNCH-LOG.TXT Processing ---
    if source == "launch-log.txt":
        if LAUNCH_TS_RX.match(line):
            embed = create_embed("Launch Attempt", line, 0x3498DB)
            discord.enqueue_embed(LIFECYCLE_THREAD_ID, embed)
        elif m := FAILED_RX.search(line):
            embed = create_embed("Launch FAILED", f"{line}\nErrorlevel: {m.group('code')}", 0xE74C3C)
            discord.enqueue_embed(LIFECYCLE_THREAD_ID, embed)
        return

    # --- SDRTRUNK_APP.LOG Processing ---
    parts = LINE_RE.match(line)
    if not parts:
        return  # Not a standard log line (e.g., part of a stack trace, which we'll handle separately)
    msg = parts.group("msg")
    level = parts.group("level")
    klass = parts.group("class")

    # -- Lifecycle Events --
    if "SDRTrunk Version" in msg:
        embed = create_embed("✅ SDRTrunk Started", line, 0x2ECC71)
        discord.enqueue_embed(LIFECYCLE_THREAD_ID, embed)
    elif "Shutdown complete" in msg:
        embed = create_embed("SDRTrunk Shutdown", line, 0xE74C3C)
        discord.enqueue_embed(LIFECYCLE_THREAD_ID, embed)
    # -- Performance/Warning Events --
    elif "Unable to source channel" in msg:
        if not is_debounced("tuner_contention"):
            embed = create_embed("⚠️ Tuner Contention", line, color_for_level("WARN"))
            discord.enqueue_embed(PERFORMANCE_THREAD_ID, embed)
    elif "Auto-Correcting Tuner PPM" in msg:
        embed = create_embed("ℹ️ PPM Auto-Correction", line, color_for_level("INFO"))
        discord.enqueue_embed(PERFORMANCE_THREAD_ID, embed)
    # -- Error Events --
    # The specific ERROR checks live in the same chain as the catch-all, so ERROR
    # lines that match neither (e.g. null pointer exceptions) still reach it.
    elif level == "ERROR" and "Calibration NOT successful" in msg:
        embed = create_embed("Tuner Calibration Failed", line, color_for_level("ERROR"))
        discord.enqueue_embed(ERRORS_THREAD_ID, embed)
    elif level == "ERROR" and "playback.AudioOutput" in klass:
        if not is_debounced("audio_output_error"):
            embed = create_embed("Audio Output Error", line, color_for_level("ERROR"))
            discord.enqueue_embed(ERRORS_THREAD_ID, embed)
    # Generic error catch-all
    elif sev_idx(level) >= MIN_IDX:
        embed = create_embed(f"{level}: {klass}", line, color_for_level(level))
        discord.enqueue_embed(ERRORS_THREAD_ID, embed)
# =======================
# Main Network Listener
# =======================
def main():
    log.info("Starting SDRTrunk Log Listener on %s:%d", LISTENER_HOST, LISTENER_PORT)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((LISTENER_HOST, LISTENER_PORT))
        s.listen()
        log.info("Listener is waiting for a connection from the shipper...")
        while True:
            try:
                conn, addr = s.accept()
                with conn:
                    log.info("Connected by %s", addr)
                    buffer = ""
                    while True:
                        data = conn.recv(1024)
                        if not data:
                            log.info("Connection from %s closed.", addr)
                            break
                        buffer += data.decode('utf-8', errors='ignore')
                        # Each complete line looks like "<source filename>::<log line>"
                        while '\n' in buffer:
                            line, buffer = buffer.split('\n', 1)
                            line = line.strip()
                            if '::' in line:
                                source, content = line.split('::', 1)
                                process_log_line(content, source)
            except ConnectionResetError:
                log.warning("Shipper disconnected abruptly. Waiting for new connection.")
            except Exception as e:
                log.error("An error occurred in the listener loop: %s", e)
                time.sleep(5)  # Avoid rapid-fire error loops

if __name__ == "__main__":
    main()
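To get a feel for what LINE_RE actually pulls out of an sdrtrunk_app.log line, here's a small standalone sketch. The pattern is the same one used in listener.py; the SAMPLE line and the parse helper are made up by me to fit the pattern, so treat them as an assumption and not as real SDRTrunk output.

```python
import re

# Same pattern as LINE_RE in listener.py, split across two string literals.
LINE_RE = re.compile(
    r"^(?P<date>\d{8})\s+(?P<time>\d{6}\.\d{3})\s+\[(?P<thr>[^\]]+)\]\s+"
    r"(?P<level>[A-Z]+)\s+(?P<class>\S+)\s*-\s*(?P<msg>.*)$"
)

# Hypothetical sample line shaped to fit the pattern; not copied from a real log.
SAMPLE = "20251005 201502.143 [main] ERROR com.example.Tuner - Calibration NOT successful"


def parse(line: str):
    """Return the named groups as a dict, or None when the line doesn't match."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None
```

Anything that doesn't start with the date/time stamp, like the indented "at ..." lines of a Java stack trace, comes back as None, which is why the listener quietly skips those.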
.env File
This will sit next to the listener.py file
# ==================================================================#
# SDR-Trunk-Discord Project v20251005.01.01 #
# Spartan X311 https://skynet2.net #
# All Your Comms Are Now Belong To Me #
# SDRTrunk Watchdog © 2025 by Spartan X311 is licensed #
# under CC BY-SA 4.0. To view a copy of this license, visit #
# https://creativecommons.org/licenses/by-sa/4.0/ #
# #
# .env #
# ==================================================================#
# Discord Bot Token from the Discord Developer Portal
DISCORD_BOT_TOKEN="your-bot-token-here"
# The IP address for the listener to bind to. 0.0.0.0 means it will listen on all available network interfaces.
LISTENER_HOST="0.0.0.0"
LISTENER_PORT="9999"
# Minimum log level to report for generic errors (INFO, WARN, ERROR)
APP_LOG_MIN_LEVEL="ERROR"
# Discord Thread IDs - PASTE THE IDs YOU CREATED
LIFECYCLE_THREAD_ID="your-thread-id-here"
ERRORS_THREAD_ID="your-thread-id-here"
PERFORMANCE_THREAD_ID="your-thread-id-here"
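The listener only notices a bad thread ID when it tries to post, and then it just logs a warning. If you'd like to catch a forgotten placeholder earlier, something like the sketch below could run once at startup. The helper name invalid_thread_ids is hypothetical (it's not in listener.py); the key names are the ones from the .env file above.

```python
import os

# Key names match the .env file; the helper itself is a hypothetical addition.
THREAD_KEYS = ("LIFECYCLE_THREAD_ID", "ERRORS_THREAD_ID", "PERFORMANCE_THREAD_ID")


def invalid_thread_ids(env=os.environ) -> list:
    """Return the keys whose values are missing or not purely numeric."""
    bad = []
    for key in THREAD_KEYS:
        value = (env.get(key) or "").strip()
        if not value.isdigit():
            bad.append(key)
    return bad
```

Called after load_dotenv(), a non-empty result means at least one thread ID still holds the placeholder text or is empty.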
Set Up Windows VM
I decided to keep this in the SDRTrunk directory, under a folder called Python Scripts. Feel free to move it wherever you want.
shipper.py
# ==================================================================#
# SDR-Trunk-Discord Project v20251005.01.01 #
# Spartan X311 https://skynet2.net #
# All Your Comms Are Now Belong To Me #
# SDRTrunk Watchdog © 2025 by Spartan X311 is licensed #
# under CC BY-SA 4.0. To view a copy of this license, visit #
# https://creativecommons.org/licenses/by-sa/4.0/ #
# #
# shipper.py #
# ==================================================================#
import os
import time
import socket
import pathlib
import threading
import logging
# =======================
# Configuration
# =======================
# !!! IMPORTANT: Change this to the IP address of your LXC !!!
LISTENER_HOST = "192.168.0.165" # <--- REPLACE WITH YOUR LXC's IP
LISTENER_PORT = 9999
# --- Paths to your SDRTrunk log files ---
SDRTRUNK_DIR = pathlib.Path.home() / "OneDrive" / "SDR" / "SDR Trunk"
APP_LOG_DIR = SDRTRUNK_DIR / "logs"
LAUNCH_LOG_PATH = SDRTRUNK_DIR / "launch-log.txt"
logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
log = logging.getLogger("sdr-shipper")
# =======================
# Main Shipper Logic
# =======================
# Both tail threads write to the same socket, so serialize sends to keep lines from interleaving
SEND_LOCK = threading.Lock()

def tail_and_send(filepath: pathlib.Path, sock: socket.socket):
    log.info("Tailing file: %s", filepath)
    filename = filepath.name
    try:
        with filepath.open('r', encoding='utf-8', errors='ignore') as f:
            f.seek(0, 2)  # Go to the end of the file
            while True:
                line = f.readline()
                if not line:
                    time.sleep(0.2)
                    continue
                line = line.strip()
                if line:
                    # Prepend the source filename for the listener
                    message = f"{filename}::{line}\n".encode('utf-8')
                    try:
                        with SEND_LOCK:
                            sock.sendall(message)
                    except OSError as e:
                        # Covers BrokenPipeError, ConnectionResetError, ConnectionAbortedError and timeouts
                        log.error("Connection lost to listener: %s. Will attempt to reconnect.", e)
                        return  # Exit function to trigger reconnection
    except FileNotFoundError:
        # This thread ends here; the file is picked up again on the next reconnect
        log.warning("File not found: %s. Will retry on the next reconnect.", filepath)
        time.sleep(5)
    except Exception as e:
        log.error("Error tailing file %s: %s", filepath, e)

def find_latest_log(log_dir: pathlib.Path) -> "pathlib.Path | None":
    # The current log is simply 'sdrtrunk_app.log' without a date
    current_log = log_dir / "sdrtrunk_app.log"
    if current_log.exists():
        return current_log
    # Fallback to finding the most recently modified log file if the main one isn't there
    log.warning("'sdrtrunk_app.log' not found, searching for latest dated log...")
    all_logs = list(log_dir.glob("*.log"))
    if not all_logs:
        return None
    return max(all_logs, key=lambda p: p.stat().st_mtime)

def main():
    while True:  # Main reconnection loop
        try:
            with socket.create_connection((LISTENER_HOST, LISTENER_PORT), timeout=10) as sock:
                log.info("Successfully connected to listener at %s:%d", LISTENER_HOST, LISTENER_PORT)
                # Find the correct application log file to tail
                app_log_path = find_latest_log(APP_LOG_DIR)
                if app_log_path:
                    # We will tail the two log files in parallel using threads
                    app_thread = threading.Thread(target=tail_and_send, args=(app_log_path, sock), daemon=True)
                    launch_thread = threading.Thread(target=tail_and_send, args=(LAUNCH_LOG_PATH, sock), daemon=True)
                    app_thread.start()
                    launch_thread.start()
                    # Keep the main thread alive while the tailing threads run
                    app_thread.join()
                    launch_thread.join()
                else:
                    log.error("No application logs found in %s. Retrying in 30 seconds.", APP_LOG_DIR)
                    time.sleep(30)
        except (socket.timeout, ConnectionRefusedError):
            log.warning("Could not connect to listener. Retrying in 15 seconds...")
            time.sleep(15)
        except Exception as e:
            log.error("An unexpected error occurred: %s. Retrying in 15 seconds...", e)
            time.sleep(15)

if __name__ == "__main__":
    main()
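The only "protocol" between the shipper and the listener is one line per message, in the form source-filename::log-line, terminated by a newline. The sketch below mirrors that framing and the listener's buffer splitting in isolation, so you can poke at it without a socket. The function names frame and drain are mine, used here only for illustration.

```python
def frame(source: str, line: str) -> bytes:
    """Encode one log line in the '<source>::<line>' newline-terminated form."""
    return f"{source}::{line.strip()}\n".encode("utf-8")


def drain(buffer: str):
    """Split complete frames out of a receive buffer.

    Returns (frames, remainder): frames is a list of (source, content) tuples,
    remainder is whatever partial line is still waiting for its newline.
    """
    frames = []
    while "\n" in buffer:
        raw, buffer = buffer.split("\n", 1)
        raw = raw.strip()
        if "::" in raw:
            source, content = raw.split("::", 1)
            frames.append((source, content))
    return frames, buffer
```

The remainder is the part that matters: TCP hands you bytes in arbitrary chunks, so a line can arrive in two pieces and has to wait in the buffer until its newline shows up.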
Run It All
systemd Service
On the scripts LXC, I opted to use systemd to handle starting and stopping this.
nano /etc/systemd/system/sdr-listener.service
[Unit]
Description=SDRTrunk Log Listener
After=network.target
[Service]
User=root
WorkingDirectory=/opt/sdrtrunk-watchdog
ExecStart=/opt/sdrtrunk-watchdog/venv/bin/python /opt/sdrtrunk-watchdog/listener.py
Restart=always
[Install]
WantedBy=multi-user.target
# Reload systemd to recognize the new/changed file
systemctl daemon-reload
# Enable the service to start on boot
systemctl enable sdr-listener
# Start the service now
systemctl start sdr-listener
# Check that it's running correctly
systemctl status sdr-listener
Windows bat File & Startup Shortcut
On the W11 VM, I decided to make a .bat file to handle the start/stop, which lets me run additional stuff if I add more later. I then just create a shortcut to that .bat file and place it in the startup folder so it runs when the system starts up. This is also how I run my SDRTrunk application, but it has its own .bat file.
startup.bat
@echo off
setlocal
set "PROJECT_DIR=C:\Users\Administrator\OneDrive\SDR\SDR Trunk\Python Scripts"
set "VENV_PY=%PROJECT_DIR%\.venv\Scripts\python.exe"
set "LOG_DIR=%PROJECT_DIR%\logs"
set "PYTHONUTF8=1"
set "PYTHONUNBUFFERED=1"
if not exist "%LOG_DIR%" mkdir "%LOG_DIR%"
cd /d "%PROJECT_DIR%"
rem Use cmd /c so >> redirection belongs to the child, not START
start "shipper.py" /min cmd /c ""%VENV_PY%" "%PROJECT_DIR%\shipper.py""
echo.
echo SDR-Trunk-Discord suite started. Logs: %LOG_DIR%
pause
Add it To Startup Folder
- Win + R to open the Run command box
- Enter shell:startup and hit Enter
- Place a shortcut of your bat in that folder
- Close it, done! (You may want to start your bat file now anyhow)
With that, you should now see stuff in the console like this:
2025-10-05 20:15:02,143 [INFO] Successfully connected to listener at 192.168.0.165:9999
2025-10-05 20:15:02,148 [INFO] Tailing file: C:\Users\Administrator\OneDrive\SDR\SDR Trunk\logs\sdrtrunk_app.log
2025-10-05 20:15:02,149 [INFO] Tailing file: C:\Users\Administrator\OneDrive\SDR\SDR Trunk\launch-log.txt
If you see all that, you should also see the reverse of that in the scripts LXC. You can use this to tail the systemd service:
journalctl -u sdr-listener.service -f
and get this:
Oct 05 20:24:54 scripts systemd[1]: Started sdr-listener.service - SDRTrunk Log Listener.
Oct 05 20:24:54 scripts python[23001]: 2025-10-05 20:24:54,756 [INFO] Starting SDRTrunk Log Listener on 0.0.0.0:9999
Oct 05 20:24:54 scripts python[23001]: 2025-10-05 20:24:54,756 [INFO] Listener is waiting for a connection from the shipper...
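If the shipper never gets past "Could not connect to listener", a quick reachability check from the Windows VM can help narrow down whether it's the network or the script. This is a rough sketch using only the standard library; can_connect is a name I made up, and it only proves the TCP port answers, not that the listener is parsing anything.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, and timed-out attempts all land here
        return False


# Example call, using the LXC address from shipper.py:
#   can_connect("192.168.0.165", 9999)
```

A False here usually points at the listener service not running, the wrong IP in LISTENER_HOST, or a firewall in between.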