Tenant isolation
Every Prisma query is scoped by `shop` at the schema level. A custom CI lint rule rejects mutations that read by `id` without verifying `resource.shop === session.shop`. Cross-tenant data exposure is structurally impossible — not just unlikely.
- Schema-level shop scoping on every model that holds tenant data
- Tenancy-check CI rule with override-by-comment for verified-safe sites
- RBAC checks cascade: solo-merchant bypass · plan inheritance · per-permission
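The check that the CI rule looks for can be sketched as a small guard. This is a minimal illustration, not the actual helper: `Session`, `TenantResource`, `assertSameShop`, and the in-memory `findById` are hypothetical names standing in for the real session type and Prisma lookups.

```typescript
// Hypothetical shapes; the real session and model types are not shown here.
interface Session { shop: string }
interface TenantResource { id: string; shop: string }

class TenancyError extends Error {}

// Returns the row only if it belongs to the caller's shop; throws otherwise.
function assertSameShop<T extends TenantResource>(resource: T | null, session: Session): T {
  if (resource === null || resource.shop !== session.shop) {
    // Same error for "missing" and "wrong tenant" so ids cannot be probed.
    throw new TenancyError("resource not found");
  }
  return resource;
}

// In-memory stand-in for a lookup by id.
const rows: TenantResource[] = [
  { id: "r1", shop: "a.myshopify.com" },
  { id: "r2", shop: "b.myshopify.com" },
];

function findById(id: string): TenantResource | null {
  return rows.find((r) => r.id === id) ?? null;
}
```

A mutation would call `assertSameShop(findById(id), session)` before touching the row, so a read by `id` alone never reaches the write path.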
Fine-grained RBAC
145 named permissions across 17 engines. 71% of mutating routes wired to a specific permission. Shadow-mode enforcement lets agencies adjust role assignments before flipping live (per-engine env flag).
- 145 permissions across 17 engine namespaces
- Shadow-mode default, env-flag-controlled enforcement (RBAC_ENFORCE_<ENGINE>=1)
- Per-day metering surfaces dead permissions in /app/rbac-admin
- Solo-merchant bypass cached at 60s TTL
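Shadow mode versus live enforcement can be sketched as below. Only the `RBAC_ENFORCE_<ENGINE>=1` flag comes from the text above; the `engine:resource:action` permission format, the `Decision` type, and the function names are assumptions for illustration. The environment is passed in as an argument so the sketch stays testable.

```typescript
type Env = Record<string, string | undefined>;

interface Decision {
  allowed: boolean;
  shadowDenied: boolean; // true when the call would have been blocked under enforcement
}

// Enforcement is on for an engine only when its flag is exactly "1".
function isEnforced(engine: string, env: Env): boolean {
  return env[`RBAC_ENFORCE_${engine.toUpperCase()}`] === "1";
}

function checkPermission(granted: ReadonlySet<string>, permission: string, env: Env): Decision {
  // Assumed format: the engine namespace is the first ":"-separated segment.
  const engine = permission.split(":")[0] ?? "";
  if (granted.has(permission)) {
    return { allowed: true, shadowDenied: false };
  }
  // Missing permission: block when enforced, otherwise allow and record the miss.
  const enforced = isEnforced(engine, env);
  return { allowed: !enforced, shadowDenied: !enforced };
}
```

In shadow mode the `shadowDenied` flag is what an agency would review before setting the engine's flag to `1`.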
Encryption
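AES-256-GCM at rest (KMS-managed keys), TLS 1.3 in transit. Platform secrets live in environment variables only: never in DB columns, never in logs. Merchants cannot store provider API keys with us; we resolve them per-request from platform secrets. The merchant-supplied secrets we do persist, such as bring-your-own OAuth credentials, are encrypted per-shop in AppSettings.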
- AES-256-GCM at rest, KMS-managed keys
- TLS 1.3 in transit, modern cipher suites only
- Merchant secrets stored encrypted per-shop in AppSettings — never plaintext, never logged
- Encryption helper centralized; new secret persistence requires architecture review
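A centralized helper for AES-256-GCM might look roughly like this, using Node's built-in `node:crypto`. The function names and the `iv.tag.ciphertext` payload layout are assumptions, and the key is passed in directly here, whereas the text above describes KMS-managed keys.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// key must be 32 bytes (AES-256). Key management is out of scope for this sketch.
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, fresh for every message
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // integrity tag checked on decrypt
  // Assumed storage layout: base64(iv).base64(tag).base64(ciphertext)
  return [iv, tag, ciphertext].map((b) => b.toString("base64")).join(".");
}

function decryptSecret(payload: string, key: Buffer): string {
  const [iv, tag, ciphertext] = payload.split(".").map((p) => Buffer.from(p, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the payload was tampered with.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because GCM is authenticated, a wrong key or a modified payload fails loudly on decrypt instead of returning garbage.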
OAuth security
Hybrid credential model: merchants can bring their own OAuth Client ID/Secret for full tenant isolation, or use the platform-default app for one-click onboarding. Refresh tokens are always stored encrypted and limited to the minimum necessary scopes. An hourly worker exchanges each refresh token for a fresh access token through the public OAuth endpoint, never through SDK calls that might leak credentials. Token-stale errors surface as a clean reauth flow, not a stack trace.
- Per-platform OAuth flows with HMAC-bound state (CSRF-safe)
- Hybrid credentials: merchant-first lookup, platform-default fallback
- Conversion pixels (Meta, GA4, TikTok) are strictly per-merchant — no cross-tenant attribution risk
- Refresh tokens encrypted, scope-minimized
- Hourly rotation worker, fail-closed on rotation error
- Token-stale flagged as reauth_required, not 500
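One way to bind OAuth `state` with an HMAC is sketched below with Node's `node:crypto`. Binding the MAC to the shop domain, the `nonce.mac` layout, and the function names are assumptions; the text above only says that state is HMAC-bound.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

// Issues a state value whose MAC covers both the shop and a random nonce.
function signState(shop: string, secret: string): string {
  const nonce = randomBytes(16).toString("hex");
  const mac = createHmac("sha256", secret).update(`${shop}.${nonce}`).digest("hex");
  return `${nonce}.${mac}`;
}

// Accepts the callback only if the MAC matches for this exact shop.
function verifyState(state: string, shop: string, secret: string): boolean {
  const [nonce, mac] = state.split(".");
  if (!nonce || !mac) return false;
  const expected = createHmac("sha256", secret).update(`${shop}.${nonce}`).digest();
  const given = Buffer.from(mac, "hex");
  // Length check first: timingSafeEqual throws on unequal lengths.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

A state issued for one shop does not verify for another, so a callback replayed against a different tenant is rejected.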
Audit logging
Every autonomous action — every blog generation, segment creation, redirect, paid bid mutation — writes a plain-language audit row to engine-audit. Searchable by engine, action type, resource, severity, and time window. The audit log is the merchant's evidence layer.
- engineAudit() called on every autonomous action
- Plain-language reason field — 'Created segment because 47 customers matched lifecycle=at_risk'
- 365-day default retention; configurable per-tier
- Parent-child correlation via traceId
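The shape of an audit row and the traceId correlation can be sketched as follows. `engineAudit()` is named above, but its real signature and the row schema are not given, so everything here (field names, severity values, the in-memory store, `traceRows`) is an assumption for illustration.

```typescript
// Hypothetical row shape based on the searchable fields listed above.
interface AuditRow {
  shop: string;
  engine: string;
  action: string;
  resource: string;
  severity: "info" | "warning" | "critical";
  reason: string;  // plain-language explanation
  traceId: string; // shared by a parent action and its children
  at: Date;
}

const auditLog: AuditRow[] = []; // in-memory stand-in for the engine-audit store

function engineAudit(row: Omit<AuditRow, "at">, at: Date = new Date()): AuditRow {
  const stored: AuditRow = { ...row, at };
  auditLog.push(stored);
  return stored;
}

// Returns every row of one shop that shares a traceId, oldest first.
function traceRows(shop: string, traceId: string): AuditRow[] {
  return auditLog
    .filter((r) => r.shop === shop && r.traceId === traceId)
    .sort((a, b) => a.at.getTime() - b.at.getTime());
}
```

Looking up a `traceId` then reconstructs the chain from a parent action to the child actions it triggered.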
Rate limits & cost controls
Throttling lives at the helper level: callAI() and shopifyGraphQL() self-throttle, so routes can't bypass it. Per-shop AI budget caps with 80% and 100% alerts. Public endpoints rate-limited per-shop AND per-IP. Workers gate external API calls through a token-bucket throttle.
- Per-shop sliding-window rate limits, plan-tier-aware
- Monthly AI budget caps with multi-channel alerting
- Per-IP rate limits on public storefront APIs
- Worker-side throttle for Shopify, GSC, WhatsApp, Replicate
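A per-key sliding-window limiter and the 80%/100% budget thresholds can be sketched as below. The class, method names, and return labels are hypothetical, and timestamps are kept in memory here, whereas a real deployment would presumably use shared storage.

```typescript
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // key could be a shop domain or an IP; now is injectable for testing.
  allow(key: string, now: number = Date.now()): boolean {
    // Keep only the timestamps still inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => now - t < this.windowMs);
    const ok = recent.length < this.limit;
    if (ok) recent.push(now);
    this.hits.set(key, recent);
    return ok;
  }
}

// Maps spend against a cap to the two alert thresholds mentioned above.
function budgetAlert(spent: number, cap: number): "ok" | "warn_80" | "cap_100" {
  if (spent >= cap) return "cap_100";
  if (spent >= 0.8 * cap) return "warn_80";
  return "ok";
}
```

Limiting per-shop and per-IP at once would mean calling `allow()` twice with two keys and requiring both to pass.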
Automation safety
Dry-run by default. 7-day mandatory dry-run period before any paid-spend automation can flip live. Snapshot-and-undo on every action. Kill-switch flips an entire shop's automation off in one click. Approval queues for agency-managed shops route high-blast-radius actions through human review.
- Every action exports snapshot() + handler() + undoHandler()
- 7-day mandatory dry-run for ads:bid:adjust and similar paid-spend actions
- Kill-switch (AppSettings.automationsPaused) and freeze windows (BFCM-safe)
- Approval queues with /app/automations/approvals UI
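The action contract and the order of the safety checks can be sketched as below. Only the `snapshot()` / `handler()` / `undoHandler()` triple, the kill-switch, and the 7-day period come from the text above. The `runAction` runner, the `ShopFlags` shape, and the result type are hypothetical, and the sketch is synchronous for brevity where real handlers would presumably be async.

```typescript
interface AutomationAction<S> {
  name: string;
  snapshot(): S;                  // capture state before any change
  handler(): void;                // apply the change
  undoHandler(snapshot: S): void; // restore from the snapshot
}

interface ShopFlags {
  automationsPaused: boolean; // kill-switch
  dryRun: boolean;
}

type RunResult<S> =
  | { status: "paused" }
  | { status: "dry_run"; snapshot: S }
  | { status: "applied"; snapshot: S };

function runAction<S>(action: AutomationAction<S>, flags: ShopFlags): RunResult<S> {
  if (flags.automationsPaused) return { status: "paused" }; // nothing runs
  const snapshot = action.snapshot();
  if (flags.dryRun) return { status: "dry_run", snapshot }; // report only
  action.handler();
  return { status: "applied", snapshot };
}

const DRY_RUN_MS = 7 * 24 * 60 * 60 * 1000;

// Paid-spend actions may go live only after the full 7-day dry-run period.
function paidSpendCanGoLive(dryRunStartedAt: Date, now: Date): boolean {
  return now.getTime() - dryRunStartedAt.getTime() >= DRY_RUN_MS;
}
```

Taking the snapshot before the dry-run check means a dry run reports what would be changed, and an applied run always returns what `undoHandler()` needs.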
Backup & DR
Daily encrypted snapshots with 30-day retention. Multi-AZ Postgres with synchronous replica. RTO 15 minutes, RPO 1 hour committed in enterprise SLA. Restore drills run quarterly.
- Daily encrypted snapshots, 30-day default retention
- Multi-AZ deployment, synchronous Postgres replica
- RTO 15 minutes, RPO 1 hour (Enterprise SLA)
- Quarterly restore drills with documented success criteria
Compliance roadmap
SOC 2 Type II audit in progress (target completion 2026 Q2). GDPR-ready: DPA available on request, data-residency options for EU. ISO 27001 on roadmap (2026 Q4). HIPAA available for relevant verticals on request.
- SOC 2 Type II — in progress, audit firm engaged
- GDPR — DPA available, EU data residency option
- ISO 27001 — 2026 Q4 target
- HIPAA — available on request
- GDPR · Compliant · DPA available, EU data residency option
- HIPAA-ready · Available · On request for relevant verticals
- PCI DSS L1 · Via Shopify · Inherits Shopify's L1 certification
- SOC 2 Type II · In progress · Audit firm engaged, target 2026 Q2
- ISO 27001 · 2026 Q4 · Scoped, gap analysis underway
Reliability engineering
Concurrency caps on every BullMQ worker. Graceful degradation: fail-open on observability outages, fail-closed on permission checks. Worker-side throttle prevents runaway jobs from hammering external APIs.
- Per-worker concurrency cap (default 3, tunable)
- Fail-open observability, fail-closed permissions
- Per-platform token-bucket throttle on external API calls
- Dead-letter handling on persistent failures
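A token bucket of the kind described for external API calls can be sketched as below. The class name, the parameters, and the injectable clock are assumptions for illustration; per-platform capacities and refill rates are not stated above.

```typescript
class TokenBucket {
  private tokens: number;
  private last: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;

  constructor(capacity: number, refillPerSec: number, now: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full so short bursts are allowed
    this.last = now;
  }

  // Returns true and spends one token if a call may go out now.
  tryTake(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.last) / 1000;
    // Refill in proportion to elapsed time, never above capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A worker would keep one bucket per external platform and delay or requeue the job whenever `tryTake()` returns false, which bounds the call rate even if many jobs are queued.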
Secret management
Provider API keys are platform-level only; merchants never see them. Provider routing happens in callAI(); the fallback chain is transparent to the caller. Rotation runbook documented; secrets rotate without code changes.
- Application-level keys, environment-only
- No merchant-facing key storage
- Provider routing + fallback in callAI()
- Rotation without code change
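A fallback chain that stays transparent to the caller can be sketched as below. This is not the actual callAI() implementation: the `Provider` type, `callWithFallback`, and the synchronous signatures are assumptions made to keep the example small.

```typescript
// Hypothetical provider shape; a real provider call would be async.
interface Provider {
  name: string;
  call: (prompt: string) => string;
}

interface AIResult {
  provider: string; // which provider actually answered
  text: string;
}

// Tries providers in order and returns the first success.
function callWithFallback(providers: Provider[], prompt: string): AIResult {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, text: p.call(prompt) };
    } catch (err) {
      // Record the failure and move on to the next provider.
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}
```

The caller passes a prompt and gets text back; which provider answered is visible in the result but never has to be chosen at the call site.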