Overview

The Hybrid Runner acts as a proxy between the Collate Server and the workflow executor running in your cluster (Argo Workflows or the built-in Simple Kubernetes executor). Collate doesn’t store or access your data source secrets. Those are managed entirely within your infrastructure and retrieved at runtime by the Runner.

How the Runner Fits Into Collate

  • WebSocket Outbound Connection Model: The Hybrid Runner establishes an outbound WebSocket connection to your Collate instance, secured with Transport Layer Security (TLS). Collate doesn’t initiate any inbound connections to your cluster.
  • Secrets Resolution at Runtime: The Runner resolves connector credentials from your configured secrets store at ingestion time. They’re never stored in or transmitted to Collate.
  • No Inbound Access from Collate: Your cluster doesn’t need to accept any inbound connections. Port 443 egress to *.getcollate.io is required for the Hybrid Runner’s outbound WebSocket control-plane connection to Collate. Your cluster may also need egress to your container registry, cloud secrets manager, and data source systems depending on your configuration.
Note: The Hybrid Runner runs inside your own Kubernetes cluster. A team member familiar with Kubernetes and Helm can get it up and running in a few minutes.
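
The egress requirement above can be sketched as a standard Kubernetes NetworkPolicy. This is an illustrative fragment, not part of the Helm chart: the policy name, namespace, and pod label are hypothetical and must be adapted to your deployment. Standard NetworkPolicy rules can't match hostnames such as *.getcollate.io, so this sketch allows port 443 egress to any destination, plus DNS.

```yaml
# Illustrative only: name, namespace, and labels are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: hybrid-runner-egress      # hypothetical name
  namespace: collate-runner       # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: hybrid-runner          # hypothetical label; match your Runner pods
  policyTypes:
    - Egress                      # this policy only governs outbound traffic
  egress:
    # Outbound TLS traffic on port 443 (WebSocket connection to Collate).
    - ports:
        - protocol: TCP
          port: 443
    # DNS lookups, needed to resolve the Collate hostname.
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

If the selected pods also need to reach a container registry, a cloud secrets manager, or data sources on other ports, add matching egress rules; otherwise this policy would block that traffic.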

Architecture Overview

Figure: Hybrid Runner architecture diagram showing the WebSocket connection between Collate SaaS and the Hybrid Runner, and the flow from Argo Workflows through ingestion pods to Kubernetes Secrets.

How the Hybrid Runner Works

The Hybrid Runner connects to Collate over an outbound WebSocket and executes ingestion jobs inside your cluster. Credentials never leave your environment.
  • WebSocket Proxy Model: The Hybrid Runner maintains a persistent WebSocket connection to Collate. When you trigger an ingestion job in the Collate UI, the server sends the job configuration to the Runner over this connection.
  • Outbound-Only TLS Connection: All traffic is outbound from your cluster. The Runner connects to your Collate instance on port 443 using Transport Layer Security (TLS). Inbound firewall rules aren’t required.
  • Secrets Retrieved at Runtime: When an ingestion pod starts, it resolves credentials from your configured secrets store using the secret path specified in the Collate UI connection form. Collate never stores credentials. By default, native Kubernetes Secrets are used, and the Helm chart automatically creates the necessary RBAC permissions. You can optionally integrate with a cloud-native secrets manager (AWS Secrets Manager, GCP Secret Manager, or Azure Key Vault) depending on your environment.
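
As a rough illustration of the default secrets store, a data source credential could live in a native Kubernetes Secret such as the one below. The Secret name, namespace, and key are hypothetical. The exact secret path format expected by the Collate UI connection form isn't described in this section, so treat the mapping between this Secret and that path as something to confirm against the connector setup instructions.

```yaml
# Illustrative only: name, namespace, and key are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: warehouse-credentials     # hypothetical Secret name
  namespace: collate-runner       # hypothetical namespace
type: Opaque
stringData:
  # Placeholder value; the real credential stays in your cluster.
  password: "<your-data-source-password>"
```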

Authentication Model

  • IngestionBot JWT Token: The Hybrid Runner authenticates to the Collate API using a JSON Web Token (JWT) from the IngestionBot service account. Configure this token as config.authToken in the Helm values file.
  • Token Scope and Lifetime: The IngestionBot JWT has no expiration date, so store it securely and treat it as a long-lived secret. To obtain the token, see Obtain Collate JSON Web Token (JWT).
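
In a Helm values file, the config.authToken setting mentioned above would look roughly like this. Only the key named in this section is shown; the token value is a placeholder, and any other keys the chart requires are omitted.

```yaml
# Helm values fragment: only config.authToken comes from this page.
config:
  # Placeholder: paste the IngestionBot JWT here.
  # The token doesn't expire, so avoid committing this file to version control.
  authToken: "<ingestion-bot-jwt>"
```

Because the token is a long-lived secret, one option is to keep it out of the values file and supply it at install time through Helm's --set flag or a separate, access-restricted values file.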

What Is Argo Workflows?

Argo Workflows is a Kubernetes-native workflow orchestration engine. When Argo is installed, the Hybrid Runner uses it to execute and manage individual ingestion jobs as Kubernetes pods.
  • Role in the Ingestion Pipeline: When the Runner receives a job from Collate, it submits an Argo Workflow. Argo schedules and monitors the ingestion pod, handles retries, and reports completion status back to the Runner.
  • Installation: Argo Workflows is installed when installArgoWorkflows: true is set in your Helm values. The chart default is false; without Argo, the Runner falls back to the Simple Kubernetes executor to run ingestion pods. In multi-instance deployments, a single Argo installation can be shared: set installArgoWorkflows: false on additional Runner instances. See Advanced Configuration. For more information about Argo Workflows, see What is Argo Workflows?
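
The multi-instance arrangement described above can be sketched as two Helm values fragments, one per Runner instance. The file names in the comments are hypothetical, and only the installArgoWorkflows key is taken from this page.

```yaml
# values-runner-primary.yaml (hypothetical file name)
# First Runner instance: installs Argo Workflows alongside the Runner.
installArgoWorkflows: true
---
# values-runner-secondary.yaml (hypothetical file name)
# Additional Runner instance: shares the existing Argo installation.
installArgoWorkflows: false
```

Leaving the key at its default of false on a standalone instance means ingestion pods run through the Simple Kubernetes executor instead of Argo.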