Troubleshooting
Use this section to diagnose the most common issues after deployment. For each symptom, work through the steps in order.
Could Not Get the Secret Value or Forbidden
Cause 1 — Missing IAM or Workload Identity
Your ingestion service account isn’t bound to the correct IAM (Identity and Access Management) role or Workload Identity. Do the following checks:
- Verify the annotation on the ingestion service account.
- Confirm the cloud IAM binding is in place for your provider.
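The first check can be sketched as a small shell helper. This is a minimal sketch, not an official command from the Collate docs: the function name show_sa_annotations is hypothetical, the namespace is a placeholder you supply, and the service account name defaults to ingestion as named above. Which annotation you should see depends on your provider; commonly used keys are eks.amazonaws.com/role-arn (EKS), iam.gke.io/gcp-service-account (GKE), and azure.workload.identity/client-id (AKS), but whether your installation uses exactly these is an assumption to verify against your setup.

```shell
# Hypothetical helper: print the annotations on the ingestion service account.
# $1 = namespace (placeholder, supplied by you)
# $2 = service account name (assumed to be "ingestion" unless overridden)
show_sa_annotations() {
  namespace="$1"
  service_account="${2:-ingestion}"
  kubectl get serviceaccount "$service_account" -n "$namespace" \
    -o jsonpath='{.metadata.annotations}'
}

# Usage (replace <namespace> with the namespace the Runner is installed in):
#   show_sa_annotations <namespace>
```

If the output is empty or the role or identity it names is not the one bound in your cloud IAM, that mismatch is the likely cause of the Forbidden error.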
Cause 2 — Secret Name Mismatch
The name you entered in the Collate UI doesn’t match the name under which the secret is stored in your secrets store. When you enter secret:my-db-password in the Collate UI, the Runner strips the secret: prefix and looks up my-db-password directly in your secrets store. If the secret was stored under a different name (for example, with a path prefix like /collate/hybrid-ingestion-runner/my-db-password), the lookup fails because the Runner searches for my-db-password, not the full path.
Do the following checks:
- Open your secrets store (AWS Secrets Manager, Azure Key Vault, or GCP Secret Manager) and confirm the exact name the secret is stored under.
- In the Collate UI connection form, verify the masked field contains secret:<secret-name>, where <secret-name> matches the name in your secrets store character for character.
- Check for typos, extra slashes, or path segments that aren’t part of the stored secret name.
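The mismatch described above can be illustrated with a few lines of shell. This is only an illustration of the prefix-stripping behavior, not the Runner's actual code; the function name lookup_name and the two example values are hypothetical.

```shell
# Illustration only: mimic how the lookup name is derived from the UI value.
# This is NOT the Runner's implementation.
lookup_name() {
  # Remove a leading "secret:" prefix, if present.
  printf '%s\n' "${1#secret:}"
}

ui_value='secret:my-db-password'
# Example of a secret stored under a path-prefixed name (the failing case).
stored_name='/collate/hybrid-ingestion-runner/my-db-password'

if [ "$(lookup_name "$ui_value")" = "$stored_name" ]; then
  echo "match"
else
  echo "mismatch: lookup uses '$(lookup_name "$ui_value")' but the secret is stored as '$stored_name'"
fi
```

To fix a mismatch like this one, either store the secret under the bare name or enter the full stored name after the secret: prefix in the UI.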
Cause 3 — Missing secretsManager Helm Value
To use a cloud secrets manager, set config.secretsManager explicitly in your values.yaml. Without it, the Runner falls back to Kubernetes Secrets and can’t resolve cloud secrets manager paths.
Do the following steps:
- Open your values.yaml.
- Confirm config.secretsManager is set to the correct value for your provider (managed-aws, gcp, or managed-azure-kv).
- Run helm upgrade to apply the change.
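The steps above can be sketched in shell. The helper below only checks a value against the three cloud provider values named above; the function name is hypothetical, and the release name, chart, and namespace in the commented commands are placeholders, since they depend on how you installed the Runner.

```shell
# Hypothetical helper: return 0 if the value is one of the three
# cloud provider values named in this section, 1 otherwise.
is_cloud_secrets_manager() {
  case "$1" in
    managed-aws|gcp|managed-azure-kv) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   is_cloud_secrets_manager managed-aws && echo "ok"
#
# Inspect the values currently applied to the release (placeholders in <>):
#   helm get values <release-name> -n <namespace>
#
# Apply the updated values.yaml (placeholders in <>):
#   helm upgrade <release-name> <chart> -n <namespace> -f values.yaml
```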
Runner Shows as Inactive in the Collate UI
- Check that the authToken in values.yaml is the correct, unexpired JWT from the IngestionBot.
- Verify outbound TLS (port 443) is allowed from your cluster to <your-instance>.getcollate.io.
- Confirm the pod is running: kubectl get pods.
- Check the Runner pod logs for connection or authentication errors.
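One way to do the log check is sketched below. The function name runner_log_errors is hypothetical, the namespace and pod name are placeholders you supply, and the keyword list is a guess at what connection or authentication failures tend to look like in logs, not a list taken from the Runner.

```shell
# Hypothetical helper: show recent Runner log lines that look like
# connection or authentication problems.
# $1 = namespace, $2 = Runner pod name (both supplied by you)
runner_log_errors() {
  namespace="$1"
  pod="$2"
  # The keyword filter is an assumption; widen it if nothing matches.
  kubectl logs "$pod" -n "$namespace" --tail=200 \
    | grep -iE 'error|unauthorized|forbidden|timeout|refused'
}

# Usage (find the pod name first with: kubectl get pods -n <namespace>):
#   runner_log_errors <namespace> <runner-pod-name>
```

If the filter returns nothing, read the unfiltered output of kubectl logs instead, because the relevant message may use different wording.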
ImagePullBackOff on the Runner Pod
The ECR credentials cron job may not have run yet. Trigger it manually.
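A minimal sketch of a manual trigger, using kubectl's generic ability to create a one-off Job from a CronJob. The function name trigger_cronjob is hypothetical, and the CronJob name and namespace are placeholders: this page does not state what the ECR credentials cron job is called in your installation, so list the CronJobs first to find it.

```shell
# Hypothetical helper: run a CronJob once, immediately.
# $1 = namespace, $2 = CronJob name (both supplied by you)
trigger_cronjob() {
  namespace="$1"
  cronjob="$2"
  # Job names must be unique, so append the current Unix timestamp.
  job_name="${cronjob}-manual-$(date +%s)"
  kubectl create job "$job_name" --from="cronjob/${cronjob}" -n "$namespace"
}

# Usage:
#   kubectl get cronjobs -n <namespace>          # find the CronJob name
#   trigger_cronjob <namespace> <cronjob-name>   # run it now
```

After the Job completes, the Runner pod should be able to pull its image on the next retry.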
Ingestion Pod Not Found — Diagnostics Unavailable
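Issue: The ingestion job fails and the exit handler reports that the ingestion pod could not be found, so no diagnostics are available.
A pod that fails on its own is normally left in an Errored or OOMKilled state. An absent pod or a ContainerStatusUnknown state indicates the pod was removed externally, typically by one of the following: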
- Cluster autoscaling scaled down the node running the ingestion pod.
- A pod cleanup policy or TTL controller removed the pod.
- The node was rotated or replaced during the ingestion run.
To identify which one applies:
- Check the pod state immediately after the next failure.
- Review cluster events around the time of failure.
- Once identified, work with your infrastructure team to address the cause, for example by configuring scale-down protection for ingestion workloads or excluding the argo-workflows namespace from pod cleanup policies.
If the pod is absent, it was removed by an external process before Argo’s configured TTL. Check your ARGO_SECONDS_AFTER_COMPLETION_TTL setting to confirm the expected retention window.
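The pod-state and cluster-event checks can be sketched as a single helper. The function name ingestion_pod_snapshot is hypothetical, and the default namespace argo-workflows is an assumption based on the namespace mentioned in this section; pass a different one if your ingestion pods run elsewhere.

```shell
# Hypothetical helper: capture pod state and recent events right after a failure.
# $1 = namespace (assumed to be argo-workflows unless overridden)
ingestion_pod_snapshot() {
  namespace="${1:-argo-workflows}"
  # Pod phases plus the node each pod was scheduled on.
  kubectl get pods -n "$namespace" -o wide
  # Events sorted by last-seen time, to spot evictions, node removals, or deletions.
  kubectl get events -n "$namespace" --sort-by=.lastTimestamp
}

# Usage:
#   ingestion_pod_snapshot                  # uses argo-workflows
#   ingestion_pod_snapshot <namespace>      # explicit namespace
```

Run it as soon as possible after the failure, because Kubernetes retains events for a limited time.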