# Troubleshooting & Operations

Monitor your scanner deployment, perform maintenance tasks, and resolve common issues.

## Monitoring

### Health Checks

Check the running state of the pods:
```shell
kubectl get pods -n scanner
```

The namespace above is an example — use whatever namespace you installed the chart into (`helm -n <namespace>`).
If the Scanner API is enabled and exposed, you can monitor its health with:

```shell
# API health
curl https://scanner.internal.example.com/health
```

### Viewing Logs
```shell
# Scan Scheduler logs
kubectl logs -n scanner -l app=scan-scheduler -f

# Scan Manager logs
kubectl logs -n scanner -l app=scan-manager -f

# Logs for all non-ephemeral scanner services
kubectl logs -n scanner -l app.kubernetes.io/instance=scanner -f
```

### Resource Usage
Monitor resource consumption:

```shell
kubectl top pods -n scanner
```

### Cloud-Specific Monitoring
For cloud-specific monitoring options, see your deployment guide.

**Prometheus removed in Terraform module v2.0.0.** Older deployments bundled a Prometheus stack; this was removed to simplify the module and avoid duplicating functionality already covered by CloudWatch. On AWS, use the CloudWatch Observability addon (enabled by default). On self-managed Kubernetes, wire the scanner Deployments into whatever metrics stack you already run — their logs are available via standard `kubectl logs`, and they emit the usual container-level metrics.
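If your existing stack is Prometheus, one way to pick up the scanner pods is a pod-discovery scrape job filtered on the Helm release label shown in the logging examples above. This is a minimal sketch, not a configuration shipped by the chart; the job name, namespace, and release label value are assumptions to adapt to your setup.

```yaml
# Illustrative Prometheus scrape job (job name, namespace, and
# release label value are assumptions, not chart defaults).
scrape_configs:
  - job_name: scanner-pods
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: [scanner]
    relabel_configs:
      # Keep only pods belonging to this Helm release
      - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_instance]
        regex: scanner
        action: keep
```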
## Maintenance

### Updating Scanner Version

Scanner updates are managed through your Terraform configuration. When a new version is available, update your module version and apply:

```shell
terraform init -upgrade
terraform apply
```

The Helm chart performs a rolling update with zero downtime.
You can also update your Terraform module version or pin a specific scanner image version:

```hcl
module "internal_scanner" {
  source  = "detectify/internal-scanning/aws"
  version = "~> 3.0" # Terraform module version

  # Optionally pin scanner image version (defaults to "stable")
  # internal_scanning_version = "1.1.0"
}
```

**`internal_scanning_version` vs. module version.** `var.internal_scanning_version` is the scanner container image tag — it controls which build of `scan-scheduler`/`scan-manager`/`chrome-controller` gets deployed. It has nothing to do with the Terraform module's own version. The module version is set via `module "..." { version = "..." }` and controls the infrastructure shape; `internal_scanning_version` controls the app running on top of it. The two release independently.
### Restarting Components

```shell
# Restart all scanner components
kubectl rollout restart deployment -n scanner

# Restart a specific component
kubectl rollout restart deployment/scan-scheduler -n scanner
```

## Common Issues
### Helm install validation errors

Chart 2.0.0 validates at install time that either inline credentials or a matching `existing*Secret` is set for each of the two Secrets. You'll see one of these messages when something is missing:

- `secrets.licenseKey is required unless secrets.existingConfigSecret is set.`
- `secrets.connectorApiKey is required unless secrets.existingConfigSecret is set.`
- `secrets.registry.username is required unless secrets.existingRegistrySecret is set.`
- `secrets.registry.password is required unless secrets.existingRegistrySecret is set.`

Fix: either set the inline value under `secrets.*`, or point the chart at a pre-existing Secret using `secrets.existingConfigSecret` / `secrets.existingRegistrySecret`. See Secrets Management for the expected Secret schemas and worked recipes.

The two Secrets are independent — you can provide one inline and one via `existing*Secret`.
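As a sketch, a mixed configuration (one Secret pre-existing, one inline) might look like the following values fragment. The key names follow the validation messages above; the Secret name is an example, and the expected Secret schema is documented in Secrets Management.

```yaml
# Example values: config credentials from a pre-created Secret,
# registry credentials provided inline. The Secret name is illustrative.
secrets:
  existingConfigSecret: my-scanner-config   # Secret in the release namespace
  registry:
    username: "<docker-username>"
    password: "<docker-password>"
```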
### helm install fails into the default namespace

You see:

```
Error: internal-scanning-agent refuses to install into the `default` namespace.
```

Cause: chart 2.0.0 refuses to install into `default` to prevent accidental co-tenanting with unrelated workloads. Every resource targets `.Release.Namespace`, so mixing with `default` would give the scanner labels and RBAC rights over anything else in there.

Fix: pass an explicit `-n <namespace>` (and `--create-namespace` on first install):

```shell
helm install detectify-scanner detectify/internal-scanning-agent \
  --version '^2.0.0' \
  -n scanner \
  --create-namespace \
  -f my-values.yaml
```

### Upgrading from Chart 1.x
If `helm upgrade` fails with the validation errors above or complains about unknown values (`namespace.name`, `config.licenseKey`, `config.imagePullSecret`, `registry.imagePullSecrets`), you're on a 1.x values file against the 2.0 chart. Rewrite your values using the 1.x → 2.0 migration guide.
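As a rough sketch of the shape change (the authoritative mapping is in the migration guide; treat the key pairings below as assumptions to verify there):

```yaml
# 1.x values (rejected by the 2.0 chart):
# namespace:
#   name: scanner          # in 2.0, pass -n scanner on the helm command line instead
# config:
#   licenseKey: "..."
#
# 2.0 values (illustrative; confirm against the migration guide):
secrets:
  licenseKey: "<license-key>"
  connectorApiKey: "<connector-api-key>"
  registry:
    username: "<docker-username>"
    password: "<docker-password>"
```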
### Scanner Not Connecting to Detectify

Symptoms: Scanner shows as disconnected in the Detectify UI.

Steps to diagnose:

- Verify outbound internet access:

  ```shell
  # Attach a debug container to a scan scheduler pod and curl from it
  kubectl debug -it scan-scheduler-YOUR_POD_ID --image=curlimages/curl --target=scheduler -n scanner -- curl -I https://connector.detectify.com/status
  ```

- Check the API token is configured correctly:

  ```shell
  kubectl get secret -n scanner scanner-config -o yaml
  # If using a BYO-Secret, substitute the name you set in secrets.existingConfigSecret
  ```

- Check scan-scheduler logs for connection errors:

  ```shell
  kubectl logs -n scanner -l app=scan-scheduler --tail=100
  ```
### Scans Failing

Symptoms: Scans start but fail to complete or report errors.

Steps to diagnose:

- Check scan manager logs for errors:

  ```shell
  kubectl logs -n scanner -l app=scan-manager --tail=100
  ```

- Verify network connectivity to the target application:

  ```shell
  kubectl debug -it scan-manager-YOUR_POD_ID --image=curlimages/curl --target=manager -n scanner -- curl https://target-app.internal
  ```

- Check if scan-worker pods are being created:

  ```shell
  kubectl get pods -n scanner -w
  ```
### Pods Not Starting

Symptoms: Pods stuck in `Pending` or `CrashLoopBackOff` state.

Steps to diagnose:

- Check pod status and events:

  ```shell
  kubectl describe pod -n scanner <pod-name>
  ```

- View pod logs:

  ```shell
  kubectl logs -n scanner <pod-name>
  ```

- Check node resources:

  ```shell
  kubectl top nodes
  ```

Common causes:

- Image pull errors (check registry credentials)
- Configuration errors (check secrets and configmaps)
- Missing keys in a bring-your-own Secret — see Secret schemas
- Insufficient cluster resources (nodes need to scale up)
### High Resource Usage / OOMKilled

Symptoms: Pods being killed due to memory limits, slow performance.

Steps to diagnose:

- Monitor resource consumption:

  ```shell
  kubectl top pods -n scanner
  ```

- Check for OOMKilled events:

  ```shell
  kubectl get events -n scanner --field-selector reason=OOMKilled
  ```

Solution: Increase memory limits in your configuration or reduce concurrent scans.
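If your chart values expose standard Kubernetes resource settings per component, raising a memory limit is a small values override. The component and key names below are assumptions (check your chart's values reference); only the `resources` block itself is standard Kubernetes syntax.

```yaml
# Illustrative values override; component and key names are assumptions.
scanManager:
  resources:
    requests:
      memory: 1Gi
    limits:
      memory: 2Gi   # raise this if scan-manager pods are OOMKilled
```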
### Image Pull Errors

Symptoms: Pods stuck with `ImagePullBackOff` or `ErrImagePull` status.

Steps to diagnose:

- Check pod events for details:

  ```shell
  kubectl describe pod -n scanner <pod-name>
  ```

- Verify container registry credentials are configured:

  ```shell
  kubectl get secret -n scanner detectify-registry -o yaml
  # If using a BYO-Secret, substitute the name you set in secrets.existingRegistrySecret
  ```

Solution: Verify your Docker credentials from the Detectify UI are correctly configured. Contact Detectify support if you're unable to pull images.
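If you manage the registry Secret yourself, it needs to be a standard `kubernetes.io/dockerconfigjson` Secret. The manifest below is a sketch of that standard shape; the registry host is a placeholder, and the required Secret name and keys are documented in Secrets Management.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: detectify-registry   # or the name you set in secrets.existingRegistrySecret
  namespace: scanner
type: kubernetes.io/dockerconfigjson
stringData:
  .dockerconfigjson: |
    {
      "auths": {
        "<registry-host>": {
          "username": "<docker-username>",
          "password": "<docker-password>",
          "auth": "<base64 of username:password>"
        }
      }
    }
```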
## Getting Help

If you're unable to resolve an issue:

- Collect diagnostic information:

  ```shell
  kubectl get pods -n scanner -o wide
  kubectl describe pods -n scanner
  kubectl logs -n scanner -l app.kubernetes.io/part-of=scanner --tail=200
  kubectl get events -n scanner --sort-by='.lastTimestamp'
  ```

- Contact Detectify support with the diagnostic output and a description of the issue.