Troubleshooting & Operations
Monitor your scanner deployment, perform maintenance tasks, and resolve common issues.
Monitoring
Health Checks
Monitor scanner health from within your network:
```bash
# API health
curl https://scanner.internal.example.com/health

# Detailed status
curl https://scanner.internal.example.com/status
```

Viewing Logs
```bash
# Scan Scheduler logs
kubectl logs -n scanner -l app=scan-scheduler -f

# Scan Manager logs
kubectl logs -n scanner -l app=scan-manager -f

# All scanner logs
kubectl logs -n scanner -l app.kubernetes.io/part-of=scanner -f
```

Resource Usage
Monitor resource consumption:
```bash
kubectl top pods -n scanner
```

Prometheus (Optional)
If Prometheus is enabled in your deployment:
```bash
kubectl port-forward -n monitoring svc/prometheus 9090:9090
open http://localhost:9090
```

Cloud-Specific Monitoring
For cloud-specific monitoring options, see your cloud provider's deployment guide.
Maintenance
Updating Scanner Version
Scanner updates are managed through your Terraform configuration. When a new version is available, update your module version and apply:
```bash
terraform init -upgrade
terraform apply
```

The Helm chart performs a rolling update with zero downtime.
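To confirm the rolling update finishes cleanly, you can watch the rollout of each deployment, assuming the `scanner` namespace used throughout this guide:

```shell
# Wait (up to 5 minutes each) for every scanner deployment to finish rolling out
for d in $(kubectl get deploy -n scanner -o name); do
  kubectl rollout status "$d" -n scanner --timeout=5m
done
```

If a rollout stalls, `kubectl rollout status` reports which deployment is stuck, which narrows down where to look in the logs.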
Restarting Components
```bash
# Restart all scanner components
kubectl rollout restart deployment -n scanner

# Restart a specific component
kubectl rollout restart deployment/scan-scheduler -n scanner
```

Deployment Issues
First Terraform Apply Fails on Kubernetes/Helm Resources
Symptoms: First terraform apply creates the EKS cluster but fails with authentication or timeout errors on kubernetes_* or helm_* resources.
Cause: The Kubernetes and Helm providers need the EKS cluster endpoint to authenticate, but the cluster doesn’t fully exist until Terraform creates it. The try() wrappers in providers.tf allow the first apply to create the cluster, but Kubernetes resources can fail if IAM access entries haven’t propagated yet. This does not happen in all environments.
Solution: Run terraform apply a second time. The cluster is now fully provisioned and the providers can connect.
Terraform Hangs on Kubernetes/Helm Resources
Symptoms: terraform apply hangs indefinitely (no progress for several minutes) on kubernetes_* or helm_release.* resources.
Cause: Terraform cannot reach the EKS API endpoint from your network.
Solution A (no VPN): Enable the public EKS API endpoint:

```hcl
module "internal_scanner" {
  # ...
  cluster_endpoint_public_access       = true
  cluster_endpoint_public_access_cidrs = ["your-ip/32"] # Restrict to your IP
}
```

Solution B (with VPN): Add security group rules to allow Terraform access from your VPN network:
```hcl
module "internal_scanner" {
  # ...
  cluster_security_group_additional_rules = {
    ingress_terraform = {
      description = "Allow Terraform access to EKS API"
      protocol    = "tcp"
      from_port   = 443
      to_port     = 443
      type        = "ingress"
      cidr_blocks = ["your-vpn-cidr/24"]
    }
  }
}
```

See EKS API Access for details on both options.
ALB Not Created After Deployment
Symptoms: Scanner endpoint unreachable, no load balancer visible in AWS Console, ingress shows no address.
Diagnosis:
```bash
kubectl get ingress -n scanner
kubectl logs -n kube-system -l app.kubernetes.io/name=aws-load-balancer-controller
```

Common causes:
- Missing subnet tags: private subnets must be tagged `kubernetes.io/role/internal-elb = 1`. Add with:
  ```bash
  aws ec2 create-tags \
    --resources subnet-aaaaa subnet-bbbbb \
    --tags Key=kubernetes.io/role/internal-elb,Value=1
  ```
- Missing IAM permissions for the ALB controller
- Security group rules blocking the controller
Terraform Destroy Fails
Symptoms: terraform destroy errors out or hangs.
Cause: The ALB controller needs time to clean up AWS resources (load balancers, target groups) before being removed. The destroy ordering can cause conflicts.
Solution: Run terraform destroy again. The module includes cleanup waits, but they may not always be sufficient. If the failure persists, manually delete the load balancers in the AWS Console, then retry.
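If you need to clean up manually, the stray load balancers and their target groups can also be removed with the AWS CLI. This is a sketch: list the resources first and verify by name which ones belong to the scanner before deleting anything; the ARN placeholders must be filled in from the listing output.

```shell
# List load balancers and note any created by the scanner's ALB controller
aws elbv2 describe-load-balancers \
  --query 'LoadBalancers[].{Name:LoadBalancerName,Arn:LoadBalancerArn}' --output table

# Delete a specific load balancer by ARN, then any orphaned target groups
aws elbv2 delete-load-balancer --load-balancer-arn <load-balancer-arn>
aws elbv2 describe-target-groups \
  --query 'TargetGroups[].{Name:TargetGroupName,Arn:TargetGroupArn}' --output table
aws elbv2 delete-target-group --target-group-arn <target-group-arn>
```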
Common Issues
Scanner Not Connecting to Detectify
Symptoms: Scanner shows as disconnected in the Detectify UI.
Steps to diagnose:

- Verify outbound internet access:
  ```bash
  kubectl exec -it -n scanner deploy/scan-scheduler -- curl -v https://api.detectify.com/health
  ```
- Check that the API token is configured correctly:
  ```bash
  kubectl get secret -n scanner scanner-config -o yaml
  ```
- Check scan-scheduler logs for connection errors:
  ```bash
  kubectl logs -n scanner -l app=scan-scheduler --tail=100
  ```
Scans Failing
Symptoms: Scans start but fail to complete or report errors.
Steps to diagnose:

- Check scan manager logs for errors:
  ```bash
  kubectl logs -n scanner -l app=scan-manager --tail=100
  ```
- Verify network connectivity to the target application:
  ```bash
  kubectl exec -it -n scanner deploy/scan-manager -- curl -v https://target-app.internal
  ```
- Check whether scan-worker pods are being created:
  ```bash
  kubectl get pods -n scanner -w
  ```
Pods Not Starting
Symptoms: Pods stuck in Pending or CrashLoopBackOff state.
Steps to diagnose:

- Check pod status and events:
  ```bash
  kubectl describe pod -n scanner <pod-name>
  ```
- View pod logs:
  ```bash
  kubectl logs -n scanner <pod-name>
  ```
- Check node resources:
  ```bash
  kubectl top nodes
  ```
Common causes:
- Insufficient cluster resources (nodes need to scale up)
- Image pull errors (check registry credentials)
- Configuration errors (check secrets and configmaps)
High Resource Usage / OOMKilled
Symptoms: Pods being killed due to memory limits, slow performance.
Steps to diagnose:

- Monitor resource consumption:
  ```bash
  kubectl top pods -n scanner
  ```
- Check for OOMKilled events:
  ```bash
  kubectl get events -n scanner --field-selector reason=OOMKilled
  ```
Solution: Increase memory limits in your Terraform configuration or reduce concurrent scans. See Scaling for capacity planning guidance.
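As a sketch, raising the limits in Terraform might look like the following. The input names here are hypothetical; check your module's documented variables for the actual names and defaults.

```hcl
module "internal_scanner" {
  # ...
  # Hypothetical input names -- consult your module's variable definitions
  scan_worker_memory_limit = "4Gi" # raise the per-worker memory limit
  max_concurrent_scans     = 2     # or reduce scan concurrency instead
}
```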
Image Pull Errors
Symptoms: Pods stuck with ImagePullBackOff or ErrImagePull status.
Steps to diagnose:

- Check pod events for details:
  ```bash
  kubectl describe pod -n scanner <pod-name>
  ```
- Verify container registry credentials are configured:
  ```bash
  kubectl get secret -n scanner regcred -o yaml
  ```
Solution: Verify your Docker credentials from the Detectify UI are correctly configured. Contact Detectify support if you’re unable to pull images.
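If the `regcred` secret is missing or stale, it can be recreated with the credentials shown in the Detectify UI. The registry URL, username, and password below are placeholders to be filled in from the UI:

```shell
# Replace the registry credential secret in the scanner namespace
kubectl delete secret regcred -n scanner --ignore-not-found
kubectl create secret docker-registry regcred -n scanner \
  --docker-server=<registry-url> \
  --docker-username=<username> \
  --docker-password=<password>
```

After recreating the secret, delete the affected pods so they are rescheduled and pull the image with the new credentials.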
Load Balancer Not Created
Symptoms: Scanner endpoint unreachable, no load balancer provisioned.
Steps to diagnose:

- Check ingress and service status:
  ```bash
  kubectl get svc -n scanner
  kubectl get ingress -n scanner
  ```
- Check cloud-specific load balancer controller logs (varies by provider)
Solution: See your cloud provider’s deployment guide for specific troubleshooting steps.
Getting Help
If you’re unable to resolve an issue:
- Collect diagnostic information:
  ```bash
  kubectl get pods -n scanner -o wide
  kubectl describe pods -n scanner
  kubectl logs -n scanner -l app.kubernetes.io/part-of=scanner --tail=200
  kubectl get events -n scanner --sort-by='.lastTimestamp'
  ```
- Contact Detectify support with the diagnostic output and a description of the issue.
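The collection step above can be bundled into a single timestamped file to attach to the support ticket (the filename is just a suggestion):

```shell
# Capture all diagnostics into one file to attach to the support ticket
OUT="scanner-diagnostics-$(date +%Y%m%d-%H%M%S).txt"
{
  kubectl get pods -n scanner -o wide
  kubectl describe pods -n scanner
  kubectl logs -n scanner -l app.kubernetes.io/part-of=scanner --tail=200
  kubectl get events -n scanner --sort-by='.lastTimestamp'
} > "$OUT" 2>&1
echo "Diagnostics written to $OUT"
```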