Operational runbook
Purpose
This runbook supports on-call and release work for the leekimerp Frappe app. It complements Deployment & operations with step-by-step procedures. Adapt hostnames, paths, and supervisor unit names to your environment.
Scope: Frappe site running leekimerp + ERPNext. Out of scope: bare-metal OS hardening (use your platform standards).
Roles and prerequisites
| Role | Needs |
|---|---|
| Operator | SSH to Bench host, bench CLI, sudo for service restart if applicable |
| Developer | Git access, knowledge of hooks.py / patches.txt |
Confirm the site name (`bench use <site>` or `--site <site>`) before running destructive commands.
Healthy release: deploy new app code
- Announce maintenance if users are online; integrations may pause briefly.
- Back up (see below) before any production migrate.
- Pull application code in `apps/leekimerp` per your git workflow.
- Run migrations: `bench --site <site> migrate`
- Clear caches and restart workers (exact commands vary by supervisor): `bench --site <site> clear-cache`, then `bench restart`
- Smoke test: Desk login, open a critical DocType (e.g. Sales Invoice or Application), run one integration action in UAT if available.
- Monitor: Frappe Error Log, scheduler log, integration-specific logs.
after_migrate hooks: this app registers `leekimerp.migrate.after_migrate`. If migrate fails, read the traceback before retrying.
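The migrate, clear-cache, and restart steps above can be wrapped so that a failed migrate stops the sequence instead of falling through to a restart. This is a minimal sketch, assuming `bench` is on the PATH and the script runs from the bench directory; `deploy_commands` and `run_deploy` are hypothetical helper names, and the site name is a placeholder.

```python
import subprocess


def deploy_commands(site: str) -> list[list[str]]:
    """Return the post-pull bench commands from the release steps, in order."""
    if not site or site.startswith("-"):
        raise ValueError("an explicit site name is required")
    return [
        ["bench", "--site", site, "migrate"],
        ["bench", "--site", site, "clear-cache"],
        ["bench", "restart"],
    ]


def run_deploy(site: str, dry_run: bool = True) -> list[str]:
    """Run each command in order and return the command lines attempted."""
    attempted = []
    for cmd in deploy_commands(site):
        attempted.append(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit,
            # so a failed migrate never reaches clear-cache or restart.
            subprocess.run(cmd, check=True)
    return attempted


if __name__ == "__main__":
    # Dry run by default: print what would be executed.
    for line in run_deploy("erp.example.com"):  # placeholder site name
        print(line)
```

Defaulting to a dry run keeps the helper safe to try on a production host; pass `dry_run=False` only after the backup step is done.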
Rollback strategy
Frappe migrations are not always reversible by a simple git revert. Safe practice:
- Restore the database from the pre-deploy backup if migrate corrupted data (rare but possible with custom patches).
- If migrate succeeded but behavior is wrong, revert the code and redeploy. Changes already applied by new migrations may require forward-fix patches, so coordinate with developers.
Document the git SHA and DB backup id for every production deploy.
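One lightweight way to keep that record is an append-only JSON Lines file. The sketch below is an assumption about format, not an existing leekimerp convention; `deploy_record` and the field names are hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def deploy_record(site: str, git_sha: str, backup_id: str,
                  when: Optional[datetime] = None) -> str:
    """Serialize one deploy as a single JSON line."""
    if not (site and git_sha and backup_id):
        raise ValueError("site, git_sha and backup_id are all required")
    stamp = (when or datetime.now(timezone.utc)).isoformat()
    return json.dumps(
        {
            "site": site,
            "git_sha": git_sha,
            "backup_id": backup_id,
            "deployed_at": stamp,
        },
        sort_keys=True,
    )


if __name__ == "__main__":
    # Placeholder values; in practice the SHA would come from `git rev-parse HEAD`.
    print(deploy_record("erp.example.com", "0123abc", "backup-placeholder"))
```

Appending each line to a file kept outside the bench directory means the record survives a database restore.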
Backup before change
| Asset | Command / location |
|---|---|
| MariaDB | Your mysqldump / managed backup (point-in-time if available) |
| Files | `sites/<site>/private/files`, `sites/<site>/public/files` |
Verify restores quarterly (Handover completion checklist).
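A restore check is easier to trust when the restored files can be compared against what was backed up. The following is a minimal sketch of a checksum manifest, assuming the backup is a directory of files on disk; `backup_manifest` is a hypothetical helper and the path in the example is a placeholder.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large dumps do not load into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_manifest(backup_dir: str) -> dict[str, str]:
    """Map each file under backup_dir (relative path) to its SHA-256 digest."""
    root = Path(backup_dir)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


if __name__ == "__main__":
    # Placeholder path; point this at the directory holding one backup set.
    for name, digest in backup_manifest("./backup-placeholder").items():
        print(digest, name)
```

Generating the manifest at backup time and again after a test restore gives a direct comparison: the two dictionaries should be equal.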
Cache and session issues
Symptoms: stale DocType metadata, old JS, permission oddities after deploy.
Run `bench --site <site> clear-cache`, then `bench restart`.
Browser: hard refresh; rule out CDN caching for static assets if customized.
Scheduler and background jobs
Symptoms: emails not sent, Xero sync stale, cron-dependent reports missing.
- Confirm scheduler enabled for the site (`scheduler_enabled` in site config per Frappe docs).
- Inspect Scheduled Job Type / Error Log in Desk.
- Review `hooks.py` `scheduler_events` and custom cron entries.
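When reviewing `scheduler_events`, it helps to flatten the dictionary into a list of what runs when. The sketch below uses the general shape Frappe expects in `hooks.py` (frequency keys holding lists of dotted paths, plus a `cron` key mapping expressions to lists). The method paths shown are hypothetical and are not taken from leekimerp.

```python
# Hypothetical excerpt in the shape of a hooks.py scheduler_events dict.
# The dotted paths are placeholders, not real leekimerp functions.
scheduler_events = {
    "cron": {
        "*/15 * * * *": ["example_app.tasks.sync_placeholder"],
    },
    "hourly": ["example_app.tasks.hourly_placeholder"],
    "daily": ["example_app.tasks.daily_placeholder"],
}


def scheduled_methods(events: dict) -> list[tuple[str, str]]:
    """Flatten scheduler_events into (schedule, dotted method path) pairs."""
    pairs = []
    for key, value in events.items():
        if key == "cron":
            # cron maps each cron expression to a list of methods.
            for expr, methods in value.items():
                pairs.extend((f"cron {expr}", m) for m in methods)
        else:
            # Frequency keys such as hourly or daily hold a flat list.
            pairs.extend((key, m) for m in value)
    return pairs


if __name__ == "__main__":
    for schedule, method in scheduled_methods(scheduler_events):
        print(f"{schedule:20} {method}")
```

Comparing this list against the Scheduled Job Type records in Desk shows quickly whether a job is missing or was never registered.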
Integration triage
Xero
- Check Error Log for Python tracebacks in `api/xero.py`.
- Verify OAuth tokens are not expired; re-authorize in Xero settings if your process requires it.
- Webhooks: confirm the endpoint is reachable from Xero and the signature secret matches (Environment matrix).
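To check a suspected secret mismatch offline, the signature can be recomputed from a captured request. Xero signs the raw request body with HMAC-SHA256 using the webhook key and sends the base64 result in the `x-xero-signature` header. The sketch below assumes that scheme; the function names are hypothetical and do not reflect how `api/xero.py` is written.

```python
import base64
import hashlib
import hmac


def xero_signature(raw_body: bytes, webhook_key: str) -> str:
    """Compute the base64 HMAC-SHA256 of the raw body with the webhook key."""
    mac = hmac.new(webhook_key.encode("utf-8"), raw_body, hashlib.sha256)
    return base64.b64encode(mac.digest()).decode("ascii")


def is_valid_xero_webhook(raw_body: bytes, header_value: str,
                          webhook_key: str) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    expected = xero_signature(raw_body, webhook_key)
    return hmac.compare_digest(expected.encode("utf-8"),
                               header_value.encode("utf-8"))


if __name__ == "__main__":
    body = b'{"events": []}'          # placeholder payload
    key = "placeholder-webhook-key"   # placeholder secret
    header = xero_signature(body, key)
    print(is_valid_xero_webhook(body, header, key))
    print(is_valid_xero_webhook(body, header, "wrong-key"))
```

The body must be the exact bytes received; re-serializing parsed JSON changes whitespace and produces a different signature.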
Stripe
- Webhook delivery dashboard in Stripe; compare to `stripe_webhook` logs.
- Mismatched signing secret is a common failure mode after key rotation.
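A rotated signing secret can be confirmed by replaying a captured delivery against the old and new secrets. Stripe's `Stripe-Signature` header carries a timestamp `t` and one or more `v1` signatures, each the hex HMAC-SHA256 of `"{t}.{raw body}"`. The sketch below re-implements that check with the standard library for triage only; `verify_stripe_signature` is a hypothetical name, and production handlers would normally rely on the official `stripe` library's `Webhook.construct_event`.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(raw_body: bytes, header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Check a Stripe-Signature header against the raw body and secret."""
    parts = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamp = next((v for k, v in parts if k.strip() == "t"), None)
    candidates = [v for k, v in parts if k.strip() == "v1"]
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    # Reject stale deliveries to limit replay; tolerance is in seconds.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False
    signed_payload = timestamp.encode("ascii") + b"." + raw_body
    expected = hmac.new(secret.encode("utf-8"), signed_payload,
                        hashlib.sha256).hexdigest()
    return any(
        hmac.compare_digest(expected.encode("ascii"), c.encode("utf-8"))
        for c in candidates
    )
```

If a captured delivery verifies with the previous secret but not with the configured one, the site is still using the wrong key after rotation.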
DocuSign
- Envelope status vs internal queue DocTypes; retry dead-letter patterns per implementation.
Singpass
- Distinguish UAT vs production endpoints and credentials; redirect URI must match registration exactly (Singpass).
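Because the match is exact, small differences such as a trailing slash or `http` versus `https` are enough to break the flow. The sketch below names which part of two URIs differs; `redirect_uri_diff` is a hypothetical helper and the URLs are placeholders.

```python
from urllib.parse import urlsplit


def redirect_uri_diff(registered: str, configured: str) -> list[str]:
    """Name the URL components that differ; an empty list means an exact match."""
    if registered == configured:
        return []
    a, b = urlsplit(registered), urlsplit(configured)
    fields = ("scheme", "netloc", "path", "query", "fragment")
    diff = [f for f in fields if getattr(a, f) != getattr(b, f)]
    # Strings can differ even when parsed parts agree (e.g. scheme letter case).
    return diff or ["raw string"]


if __name__ == "__main__":
    print(redirect_uri_diff(
        "https://erp.example.com/callback",    # placeholder registered URI
        "https://erp.example.com/callback/",   # placeholder configured URI
    ))
```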
Emergency contacts (fill in locally)
| System | Escalation | Notes |
|---|---|---|
| Database | | |
| Hosting / VM | | |
| Xero / Stripe / DocuSign | Vendor status pages | |