Drawbridge Orchestration Platform
Kubernetes-pattern controller that manages your AI worker fleet declaratively. Self-healing, auto-scaling, credit-aware.
Without Drawbridge
- Manual worker scaling — SSH in, restart processes
- No visibility — which accounts are healthy? Which are exhausted?
- Single point of failure — one crash takes everything down
- Credit exhaustion kills workloads — no graceful handling
- Deployment requires downtime — stop, update, restart
With Drawbridge
- Declarative fleet — define desired state, controller maintains it
- Full visibility — 3 dashboards, 15 alerts, real-time events
- Self-healing — crashed pods replaced within 45 seconds
- Credit-aware scaling — scale to zero when exhausted, restore automatically
- Zero-downtime deploys — graceful drain, replace, resume
Controller Deep-Dive
Kubernetes-inspired reconciler pattern. Declare what you want, and the system keeps it that way.
30s Reconcile Loop
Every 30 seconds, the controller compares desired and actual state. Any drift is detected and corrected automatically.
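As a rough illustration of the reconciler idea, the sketch below diffs a desired set of workers against the set actually running and returns the actions that would close the gap. The `Action` type, the worker names, and the `reconcile` function are hypothetical and chosen for this example; this is not Drawbridge's actual controller code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str    # "create" or "delete"
    worker: str  # hypothetical worker name


def reconcile(desired: set[str], actual: set[str]) -> list[Action]:
    """Return the actions that move the actual fleet toward the desired one."""
    to_create = sorted(desired - actual)  # declared but not running
    to_delete = sorted(actual - desired)  # running but no longer declared
    return [Action("create", w) for w in to_create] + [
        Action("delete", w) for w in to_delete
    ]


# One reconcile pass: worker-a is missing, worker-c is left over.
plan = reconcile({"worker-a", "worker-b"}, {"worker-b", "worker-c"})
for action in plan:
    print(action.kind, action.worker)
```

A controller built this way would run one such pass per loop interval (30 seconds here) and apply the returned actions. When desired and actual already match, the plan is empty and nothing happens.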
Auto-Scaling Engine
Credit-based scale to zero, queue-based burst, idle timeout shrink. No manual intervention.
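The three scaling signals can be pictured as a single decision function. The sketch below is only an assumption about how they might combine: the field names, the replica cap, the prompts-per-worker ratio, and the 300-second idle timeout are all illustrative values, not documented Drawbridge settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetState:
    credits_remaining: int  # credits left on the account
    queue_depth: int        # prompts waiting for a free slot
    idle_seconds: float     # time since the last request
    current_replicas: int   # workers currently running


def desired_replicas(
    state: FleetState,
    max_replicas: int = 10,       # assumed upper bound
    prompts_per_worker: int = 4,  # assumed queue-to-worker ratio
    idle_timeout: float = 300.0,  # assumed idle timeout in seconds
) -> int:
    # Credit-based scale to zero: no credits means no workers.
    if state.credits_remaining <= 0:
        return 0
    # Queue-based burst: add workers to cover the backlog, up to the cap.
    if state.queue_depth > 0:
        needed = -(-state.queue_depth // prompts_per_worker)  # ceiling division
        return min(max_replicas, max(state.current_replicas, needed))
    # Idle timeout shrink: drop one worker at a time, keeping at least one.
    if state.idle_seconds >= idle_timeout and state.current_replicas > 1:
        return state.current_replicas - 1
    # Otherwise hold steady; this also restores one worker after credits return.
    return max(1, state.current_replicas)
```

The result would feed the desired state that the reconcile loop enforces, so scaling needs no separate mechanism of its own.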
Service Discovery
ValKey-based registry with 5s heartbeat, Pub/Sub events, and real-time topology awareness.
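To make the heartbeat idea concrete, here is a small in-memory sketch of a registry that treats a worker as live only while its heartbeats keep arriving. It stands in for the ValKey-backed registry purely for illustration. The `Registry` class and the expiry window of three missed heartbeats are assumptions; only the 5-second interval comes from the description above.

```python
HEARTBEAT_INTERVAL = 5.0         # seconds, as described above
EXPIRY = 3 * HEARTBEAT_INTERVAL  # assumed: expire after three missed heartbeats


class Registry:
    """In-memory stand-in for a heartbeat-based service registry."""

    def __init__(self) -> None:
        self._last_seen: dict[str, float] = {}

    def heartbeat(self, worker: str, now: float) -> None:
        # Record the time of the most recent heartbeat for this worker.
        self._last_seen[worker] = now

    def live_workers(self, now: float) -> list[str]:
        # A worker is live if its last heartbeat is within the expiry window.
        return sorted(
            w for w, seen in self._last_seen.items() if now - seen <= EXPIRY
        )


registry = Registry()
registry.heartbeat("worker-a", now=0.0)
registry.heartbeat("worker-b", now=10.0)
```

Passing the clock in as `now` keeps the sketch deterministic. A real registry would more likely rely on key expiry in the store itself.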
Leader Election
Distributed locks via ValKey SET NX. Multi-instance safety — only one controller acts.
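The locking idea can be sketched as follows: each controller instance tries to set a lock key only if it does not already exist (NX) and gives it an expiry (PX), so a crashed leader cannot hold the lock forever. The example runs against a tiny in-memory stand-in rather than a real ValKey server. The lock key name, the 15-second TTL, and the `InMemoryStore` class are assumptions made for illustration.

```python
from typing import Optional


class InMemoryStore:
    """Stand-in that mimics SET with the NX and PX options, for illustration only."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[str, float]] = {}  # key -> (value, expiry in ms)
        self.now_ms = 0.0  # manual clock so the example is deterministic

    def set(
        self, name: str, value: str, nx: bool = False, px: Optional[int] = None
    ) -> Optional[bool]:
        current = self._data.get(name)
        if current is not None and current[1] <= self.now_ms:
            current = None  # the previous holder's lock has expired
        if nx and current is not None:
            return None  # NX: refuse to overwrite an existing key
        expires = self.now_ms + px if px is not None else float("inf")
        self._data[name] = (value, expires)
        return True


def try_acquire_leadership(
    store: InMemoryStore,
    instance_id: str,
    lock_key: str = "drawbridge:leader",  # hypothetical key name
    ttl_ms: int = 15_000,                 # assumed lock lifetime
) -> bool:
    # Succeeds only for the first caller until the key expires.
    return bool(store.set(lock_key, instance_id, nx=True, px=ttl_ms))


store = InMemoryStore()
print(try_acquire_leadership(store, "controller-1"))  # first caller wins
print(try_acquire_leadership(store, "controller-2"))  # second caller is refused
```

A real deployment would also need the leader to renew the lock before it expires, and to release it only if it still holds it. Both steps are left out of this sketch.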
Docker + Kubernetes
Same controller code drives both Docker API and Kubernetes client-go. Choose your runtime.
Health-Based Replacement
Continuous health checks. 3 consecutive failures trigger automatic pod replacement.
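The "3 consecutive failures" rule amounts to a per-pod counter that resets on any successful check. The sketch below shows one way to express it; the `HealthTracker` class and its method names are hypothetical, not part of Drawbridge's API.

```python
FAILURE_THRESHOLD = 3  # consecutive failures before replacement, as stated above


class HealthTracker:
    """Counts consecutive failed health checks per pod."""

    def __init__(self, threshold: int = FAILURE_THRESHOLD) -> None:
        self.threshold = threshold
        self._failures: dict[str, int] = {}

    def record(self, pod: str, healthy: bool) -> bool:
        """Record one check result; return True when the pod should be replaced."""
        if healthy:
            self._failures[pod] = 0  # any success breaks the failure streak
            return False
        self._failures[pod] = self._failures.get(pod, 0) + 1
        if self._failures[pod] >= self.threshold:
            self._failures[pod] = 0  # start fresh for the replacement pod
            return True
        return False


tracker = HealthTracker()
checks = [False, False, True, False, False, False]
decisions = [tracker.record("pod-1", ok) for ok in checks]
```

In this run, the success in the middle resets the streak, so only the final check (the third failure in a row) returns `True`.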
Worker Architecture
Standalone HTTP binary. Process pool per account. Full lifecycle management.
Standalone Binary
kiro-worker runs as an independent HTTP server with /prompt, /health, and /slots endpoints.
Container-Native
Ships as a Docker image. The controller creates and destroys containers dynamically.
Health Reporting
Continuous health endpoint. Reports healthy, degraded, draining, or unhealthy state
Graceful Drain
POST /drain triggers a graceful shutdown. The worker completes in-flight requests before exiting.
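The drain behavior can be modeled as a small state machine: once draining starts, new requests are refused, and the worker may exit only after in-flight requests finish. The sketch below covers that logic alone, without the HTTP layer. The `Worker` class and its method names are assumptions for illustration and do not describe kiro-worker's internals.

```python
class Worker:
    """Minimal model of a worker's drain lifecycle (no HTTP server)."""

    def __init__(self) -> None:
        self.state = "healthy"  # one of the reported states described above
        self.in_flight = 0      # requests currently being processed

    def start_request(self) -> bool:
        # A draining worker refuses new work so it can wind down.
        if self.state == "draining":
            return False
        self.in_flight += 1
        return True

    def finish_request(self) -> None:
        if self.in_flight > 0:
            self.in_flight -= 1

    def drain(self) -> None:
        # What a POST /drain handler might do: flip the state, nothing more.
        self.state = "draining"

    def can_exit(self) -> bool:
        # Safe to exit only once every in-flight request has completed.
        return self.state == "draining" and self.in_flight == 0


worker = Worker()
accepted = worker.start_request()  # accepted while healthy
worker.drain()
refused = not worker.start_request()  # refused while draining
```

At this point `worker.can_exit()` is still false, because one request is in flight. It becomes true after `worker.finish_request()`.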
5 Deployment Modes
Same codebase, same config format. Scale from laptop to production Kubernetes.
Single Binary
One process, local workers. For development and testing.
make build && ./bin/drawbridge --config config.yaml
Docker Cluster
N nodes behind nginx LB, shared ValKey.
make prepare-docker && make up-multi
Docker Controller
Dynamic worker containers via Docker API.
make up-workers
K8s Controller
Helm chart, dynamic Pods, RBAC, HPA.
helm install drawbridge ./helm/drawbridge --set controller.enabled=true
K8s Static
Pre-defined Worker Deployments, GitOps-friendly.
helm install drawbridge ./helm/drawbridge --set workers.enabled=true
Deploy the Platform
Full Helm chart. RBAC included. Production-ready in one command.