Scaling & Capacity Planning
Plan and configure resources based on your concurrent scanning requirements.
Understanding Resource Units
Kubernetes uses specific units for CPU and memory:
| Unit | Meaning | Equivalent |
|---|---|---|
| m (millicores) | CPU measurement | 1000m = 1 CPU core |
| Mi (mebibytes) | Memory measurement | 1Mi = 1024 × 1024 bytes ≈ 1 MB |
| Gi (gibibytes) | Memory measurement | 1Gi = 1024Mi ≈ 1 GB |
For example, 200m CPU means 0.2 CPU cores (20% of one core), and 256Mi means approximately 256 MB of memory.
How Scanning Works
When a scan is triggered:
- Scan Scheduler receives the request and queues it in Redis
- Scan Manager picks up the job and creates an ephemeral Scan Worker pod
- For JavaScript-heavy applications, a Chrome Instance pod is also created
- Once the scan completes, the ephemeral pods are automatically cleaned up
Each Scan Manager pod handles up to 5 concurrent scans by default.
Key formula: scan_manager_replicas × 5 = max concurrent scans
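A minimal Terraform sketch of that relationship, reusing the scan_manager_replicas value that the configuration examples below set (the local name here is purely illustrative):

variable "scan_manager_replicas" {
  type    = number
  default = 4
}

locals {
  max_concurrent_scans = var.scan_manager_replicas * 5   # 4 × 5 = 20 concurrent scans
}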
Component Resources
Static Components (Always Running)
These components run continuously regardless of scan activity:
| Component | CPU Request | Memory Request | CPU Limit | Memory Limit | Default Replicas |
|---|---|---|---|---|---|
| Scan Scheduler | 200m | 256Mi | 1000m | 1Gi | 2 |
| Scan Manager | 200m | 256Mi | 1000m | 1Gi | 1 |
| Chrome Controller | 200m | 512Mi | 1000m | 2Gi | 1 |
| Redis | 100m | 128Mi | 500m | 512Mi | 1 |
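With the default replica counts above, the always-on baseline works out to roughly 0.9 CPU cores and about 1.4Gi of memory in requests. A quick sketch of that arithmetic as illustrative Terraform locals (not module inputs):

locals {
  static_cpu_requests_m     = 2 * 200 + 200 + 200 + 100   # Scheduler ×2 + Manager + Chrome Controller + Redis = 900m (~0.9 cores)
  static_memory_requests_mi = 2 * 256 + 256 + 512 + 128   # = 1408Mi (~1.4Gi)
}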
Dynamic Components (Per Active Scan)
These pods are created on-demand for each running scan:
| Component | CPU Request | Memory Request | CPU Limit | Memory Limit | Notes |
|---|---|---|---|---|---|
| Scan Worker | 100m | 256Mi | 1000m | 1Gi | 1 pod per scan |
| Chrome Instance | 100m | 256Mi | 1000m | 1Gi | 1 pod per scan (JavaScript-heavy apps), plus 2Gi /dev/shm |
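The deployment creates these ephemeral pods for you, so nothing here needs to be written by hand; the sketch below only illustrates, in Terraform kubernetes-provider syntax, how the requests, limits, and memory-backed /dev/shm from the table map onto a pod spec (the pod name and image are placeholders, not the product's actual manifest):

resource "kubernetes_pod" "chrome_instance_example" {
  metadata {
    name = "chrome-instance-example"   # placeholder name, for illustration only
  }

  spec {
    container {
      name  = "chrome"
      image = "example/chrome:latest"  # placeholder image

      resources {
        requests = {
          cpu    = "100m"    # 0.1 core
          memory = "256Mi"
        }
        limits = {
          cpu    = "1000m"   # 1 full core
          memory = "1Gi"
        }
      }

      volume_mount {
        name       = "dshm"
        mount_path = "/dev/shm"
      }
    }

    volume {
      name = "dshm"
      empty_dir {
        medium     = "Memory"
        size_limit = "2Gi"   # the extra 2Gi of shared memory noted in the table
      }
    }
  }
}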
Capacity Planning Table
Use this table to plan resources based on your concurrent scan requirements:
| Concurrent Scans | Scan Manager Replicas | Scan Scheduler Replicas | Chrome Controller | Total CPU | Total Memory |
|---|---|---|---|---|---|
| 5 | 1 | 1 | 1 | ~2 vCPU | ~8 Gi |
| 10 | 2 | 2 | 1 | ~4 vCPU | ~16 Gi |
| 20 | 4 | 2 | 1 | ~8 vCPU | ~32 Gi |
| 50 | 10 | 3 | 2 | ~16 vCPU | ~64 Gi |
| 100 | 20 | 3 | 2 | ~32 vCPU | ~128 Gi |
Total CPU/Memory = aggregate across all nodes. Your cluster’s node autoscaler distributes this across multiple nodes automatically.
Calculation formula:
- Scan Manager replicas = ceil(concurrent_scans / 5)
- Per scan memory (peak) = 1Gi (worker) + 3Gi (Chrome + /dev/shm) = 4Gi
- Per scan CPU (peak) = 1 core (worker) + 1 core (Chrome) = 2 cores
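If you prefer to derive these numbers in Terraform itself, a sketch like the following works (the variable and local names are illustrative, not module inputs):

variable "concurrent_scans" {
  type    = number
  default = 20
}

locals {
  scan_manager_replicas = ceil(var.concurrent_scans / 5)   # ceil(20 / 5) = 4

  peak_scan_memory_gi = var.concurrent_scans * 4   # 1Gi (worker) + 3Gi (Chrome + shm) per scan
  peak_scan_cpu_cores = var.concurrent_scans * 2   # 1 core (worker) + 1 core (Chrome) per scan
}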
No hard limit: There is no maximum concurrent scan limit enforced by the software. Scale as high as your cluster resources allow.
Node Autoscaling
Most managed Kubernetes services (EKS, AKS, GKE) support automatic node scaling. When enabled:
- Nodes are created on demand - When pods are scheduled, the autoscaler provisions nodes
- Right-sized instances - Selects appropriate instance types based on pod resource requests
- Horizontal scaling - Creates multiple smaller nodes rather than one large node
- Scale to zero - Terminates idle scan nodes when scans complete; only the nodes hosting the always-running components remain
You don’t need to pre-provision large nodes. The autoscaler handles it automatically.
Example: For 20 concurrent scans needing ~8 vCPU / ~32 Gi total, the autoscaler might create:
- 4× small nodes (2 vCPU / 8 Gi each), or
- 2× medium nodes (4 vCPU / 16 Gi each)
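On EKS, for example, this usually means giving the autoscaler a bounded node group (or a Karpenter provisioner) and letting it adjust the size. A rough sketch of such a node group, assuming the cluster, node IAM role, and subnets are defined elsewhere:

resource "aws_eks_node_group" "scan_nodes" {
  cluster_name    = "example-cluster"        # placeholder cluster name
  node_group_name = "scan-workers"
  node_role_arn   = aws_iam_role.nodes.arn   # assumes a node IAM role defined elsewhere
  subnet_ids      = var.private_subnet_ids   # assumes existing private subnets
  instance_types  = ["m5.xlarge"]            # 4 vCPU / 16 GiB per node

  scaling_config {
    min_size     = 1
    max_size     = 4    # up to ~16 vCPU / ~64 GiB for scan bursts
    desired_size = 1
  }
}

Cluster Autoscaler or Karpenter then adds and removes nodes within these bounds as scan pods are scheduled and cleaned up.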
Configuring Replicas
Static Configuration
Set replica counts directly in your Terraform configuration. Example for 20 concurrent scans:
scan_scheduler_replicas = 2
scan_manager_replicas = 4
chrome_controller_replicas = 1
Enabling Autoscaling
For dynamic workloads, enable Horizontal Pod Autoscaler (HPA) to automatically scale based on CPU utilization:
scan_scheduler_autoscaling = {
  enabled = true
  min_replicas = 2
  max_replicas = 10
  target_cpu_utilization_percentage = 70
}

scan_manager_autoscaling = {
  enabled = true
  min_replicas = 1
  max_replicas = 20
  target_cpu_utilization_percentage = 80
}

With HPA enabled, the system automatically scales based on CPU utilization. The default maximum of 20 scan-manager replicas supports 100 concurrent scans, but you can increase max_replicas to scale further; the only limit is your cluster’s available resources.
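For example, to allow up to 200 concurrent scans, raise the Scan Manager ceiling accordingly:

scan_manager_autoscaling = {
  enabled = true
  min_replicas = 1
  max_replicas = 40   # 40 × 5 = 200 concurrent scans
  target_cpu_utilization_percentage = 80
}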
Cost Considerations
Costs vary by cloud provider. See your deployment guide for provider-specific estimates:
- AWS Costs
- Azure Costs (coming soon)
- GCP Costs (coming soon)
General guidance:
- With node autoscaling, you only pay for scan-time compute while scans are running
- More concurrent scans = higher compute costs
- Static components have a baseline cost regardless of scan activity
Next Steps
- Troubleshooting - Monitoring, maintenance, and common issues