Production Ready

Drawbridge Orchestration Platform

A Kubernetes-style controller that manages your AI worker fleet declaratively. Self-healing, auto-scaling, credit-aware.

Without Drawbridge

  • Manual worker scaling — SSH in, restart processes
  • No visibility — which accounts are healthy? Which are exhausted?
  • Single point of failure — one crash takes everything down
  • Credit exhaustion kills workloads — no graceful handling
  • Deployment requires downtime — stop, update, restart

With Drawbridge

  • Declarative fleet — define desired state, controller maintains it
  • Full visibility — 3 dashboards, 15 alerts, real-time events
  • Self-healing — crashed pods replaced within 45 seconds
  • Credit-aware scaling — scale to zero when exhausted, restore automatically
  • Zero-downtime deploys — graceful drain, replace, resume

Controller Deep-Dive

Kubernetes-inspired reconciler pattern. Declare what you want, the system keeps it that way.

30s Reconcile Loop

Every 30 seconds the controller compares desired state with actual state. Any drift is detected and corrected automatically.
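The comparison step can be sketched as a pure function that takes the desired replica count and the list of running workers and returns the actions needed to close the gap. This is a minimal illustration, not Drawbridge's actual code: `Plan` and `reconcile` are hypothetical names, and the choice to remove the last-listed workers first is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Actions needed to move actual state toward desired state (hypothetical type)."""
    to_create: int   # number of workers to start
    to_remove: list  # ids of workers to stop


def reconcile(desired_replicas: int, actual_workers: list) -> Plan:
    """Compare desired and actual state and return the actions that close the gap."""
    surplus = len(actual_workers) - desired_replicas
    if surplus > 0:
        # Too many workers: remove the last-listed ones (an assumed policy).
        return Plan(to_create=0, to_remove=actual_workers[-surplus:])
    # Too few, or exactly enough: create the missing count (zero if none missing).
    return Plan(to_create=-surplus, to_remove=[])
```

A controller would run a function like this on each 30-second tick and then apply the returned plan through its runtime backend.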

Auto-Scaling Engine

Scales to zero when credits run out, bursts when the queue grows, and shrinks after an idle timeout. No manual intervention.
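One way to picture how the three signals might combine is a single decision function with credits taking priority, then queue depth, then idle time. Everything here is an assumption made for illustration: the function name, the parameter names, and the default limits (`jobs_per_worker`, `idle_timeout`, `min_replicas`, `max_replicas`) do not come from Drawbridge.

```python
def desired_replicas(credits_left: float, queue_depth: int, idle_seconds: float,
                     current: int, *, min_replicas: int = 1, max_replicas: int = 10,
                     jobs_per_worker: int = 4, idle_timeout: float = 300.0) -> int:
    """Return the target worker count for one account (hypothetical policy)."""
    if credits_left <= 0:
        # Credit exhaustion overrides everything: scale to zero.
        return 0
    if queue_depth > 0:
        # Burst: enough workers to cover the queue, never shrinking while busy.
        needed = -(-queue_depth // jobs_per_worker)  # ceiling division
        return max(min_replicas, min(max_replicas, max(current, needed)))
    if idle_seconds >= idle_timeout:
        # Idle long enough: shrink by one worker per decision.
        return max(min_replicas, current - 1)
    # Otherwise hold steady; this also restores from zero once credits return.
    return max(min_replicas, min(max_replicas, current))
```

Checking credits first means a deep queue can never keep an exhausted account running, which matches the "scale to zero when exhausted, restore automatically" behavior described above.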

Service Discovery

ValKey-based registry with 5s heartbeat, Pub/Sub events, and real-time topology awareness.
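The heartbeat idea can be shown with an in-memory stand-in for the registry: each worker records a timestamp on every heartbeat, and a worker counts as live only while its last heartbeat is recent. The `Registry` class is hypothetical, and the rule that a worker expires after three missed 5-second heartbeats is an assumption, not a documented Drawbridge setting.

```python
class Registry:
    """In-memory stand-in for a heartbeat registry (illustrative only)."""

    def __init__(self, heartbeat_interval: float = 5.0, missed_allowed: int = 3):
        # Assumed expiry rule: a worker is dropped after this many seconds of silence.
        self.ttl = heartbeat_interval * missed_allowed
        self._last_seen = {}  # worker id -> timestamp of last heartbeat

    def heartbeat(self, worker_id: str, now: float) -> None:
        """Record that worker_id was alive at time `now`."""
        self._last_seen[worker_id] = now

    def live_workers(self, now: float) -> list:
        """Return the sorted ids of workers whose last heartbeat is within the TTL."""
        return sorted(w for w, t in self._last_seen.items() if now - t <= self.ttl)
```

In a store-backed version, the same effect is usually achieved by writing each heartbeat as a key with an expiry, so silent workers disappear without a cleanup pass.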

Leader Election

Distributed locks via ValKey SET NX. Multi-instance safety — only one controller acts.
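The SET NX pattern works because only the first writer of a key succeeds. The sketch below imitates that behavior with an in-memory store so it runs without a server. The key name `drawbridge:leader`, the 15-second expiry, and the class and function names are all assumptions; pairing NX with an expiry (the `PX` option of `SET`) is a common way to keep a crashed leader from holding the lock forever, but the source does not say whether Drawbridge does so.

```python
class FakeStore:
    """In-memory imitation of `SET key value NX PX ttl` (illustrative only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def set_nx(self, key: str, value: str, ttl: float, now: float) -> bool:
        """Set key only if it is absent or expired; return True on success."""
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False  # someone else holds an unexpired lock
        self._data[key] = (value, now + ttl)
        return True


def try_become_leader(store: FakeStore, instance_id: str, now: float,
                      ttl: float = 15.0) -> bool:
    """Attempt to take the leader lock; only one instance can win at a time."""
    return store.set_nx("drawbridge:leader", instance_id, ttl, now)
```

Lock renewal by the current leader is left out here; a real implementation would need to extend the expiry while the leader stays healthy.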

Docker + Kubernetes

Same controller code drives both Docker API and Kubernetes client-go. Choose your runtime.
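A common way to let one controller drive two runtimes is to code against a small interface and plug in a backend per runtime. The sketch below uses an in-memory backend in place of the Docker API and Kubernetes clients, so every name in it (`Runtime`, `InMemoryRuntime`, `scale_to`, the worker naming scheme) is hypothetical and says nothing about how Drawbridge is actually structured.

```python
from typing import Protocol


class Runtime(Protocol):
    """Operations the controller needs from any backend (assumed interface)."""
    def list_workers(self) -> list: ...
    def start_worker(self) -> str: ...
    def stop_worker(self, name: str) -> None: ...


class InMemoryRuntime:
    """Stand-in backend; a real one would call the Docker or Kubernetes API."""

    def __init__(self, kind: str):
        self.kind = kind
        self._next = 0
        self.workers = []

    def list_workers(self) -> list:
        return list(self.workers)

    def start_worker(self) -> str:
        name = f"{self.kind}-worker-{self._next}"
        self._next += 1
        self.workers.append(name)
        return name

    def stop_worker(self, name: str) -> None:
        self.workers.remove(name)


def scale_to(runtime: Runtime, desired: int) -> None:
    """Backend-agnostic scaling: stop extras, then start whatever is missing."""
    workers = runtime.list_workers()
    for name in workers[desired:]:
        runtime.stop_worker(name)
    for _ in range(desired - len(workers)):
        runtime.start_worker()
```

Because `scale_to` only touches the interface, switching runtimes means constructing a different backend, not changing controller logic.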

Health-Based Replacement

Continuous health checks. 3 consecutive failures trigger automatic pod replacement.
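The "3 consecutive failures" rule amounts to a per-worker counter that resets on any successful check. A minimal sketch, with `HealthTracker` as a hypothetical name and the reset-after-replacement behavior as an assumption:

```python
class HealthTracker:
    """Counts consecutive failed health checks per worker (illustrative only)."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self._failures = {}  # worker id -> consecutive failure count

    def record(self, worker_id: str, healthy: bool) -> bool:
        """Record one check result; return True when the worker should be replaced."""
        if healthy:
            # Any success breaks the streak.
            self._failures[worker_id] = 0
            return False
        self._failures[worker_id] = self._failures.get(worker_id, 0) + 1
        if self._failures[worker_id] >= self.threshold:
            # Assumed: the counter restarts for the replacement worker.
            self._failures[worker_id] = 0
            return True
        return False
```

Requiring consecutive failures, not a total, keeps a single slow response from triggering a replacement.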

Worker Architecture

Standalone HTTP binary. Process pool per account. Full lifecycle management.

Standalone Binary

kiro-worker runs as an independent HTTP server with /prompt, /health, and /slots endpoints.

Container-Native

Ships as a Docker image. The controller creates and destroys containers dynamically.

Health Reporting

Continuous health endpoint. Reports a healthy, degraded, draining, or unhealthy state.

Graceful Drain

POST /drain triggers a graceful shutdown. The worker completes in-flight requests before exiting.
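The drain behavior can be modeled as a small state holder: once draining starts, new requests are refused, and the worker may exit only when the in-flight count reaches zero. This models the logic only, not the HTTP layer, and the `Worker` class with its method names is hypothetical.

```python
class Worker:
    """Models slot accounting and graceful drain (illustrative only)."""

    def __init__(self, slots: int):
        self.slots = slots        # maximum concurrent requests
        self.in_flight = 0        # requests currently being processed
        self.draining = False

    def accept(self) -> bool:
        """Try to start a request; refused while draining or when slots are full."""
        if self.draining or self.in_flight >= self.slots:
            return False
        self.in_flight += 1
        return True

    def finish(self) -> None:
        """Mark one in-flight request as completed."""
        self.in_flight -= 1

    def drain(self) -> None:
        """What a POST /drain handler would trigger: stop taking new work."""
        self.draining = True

    @property
    def can_exit(self) -> bool:
        """True once draining has started and all in-flight requests are done."""
        return self.draining and self.in_flight == 0
```

The same sequence underlies a zero-downtime deploy: drain the old worker, wait for `can_exit`, then replace it.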

5 Deployment Modes

Same codebase, same config format. Scale from laptop to production Kubernetes.

Single Binary

One process, local workers. For development and testing.

make build && ./bin/drawbridge --config config.yaml

Docker Cluster

N nodes behind an nginx load balancer, with shared ValKey.

make prepare-docker && make up-multi

Docker Controller

Dynamic worker containers via Docker API.

make up-workers

K8s Controller

Helm chart, dynamic Pods, RBAC, HPA.

helm install drawbridge ./helm/drawbridge --set controller.enabled=true

K8s Static

Pre-defined Worker Deployments, GitOps-friendly.

helm install drawbridge ./helm/drawbridge --set workers.enabled=true

Deploy the Platform

Full Helm chart. RBAC included. Production-ready in one command.