GKE Hybrid Runner architecture diagram showing the WebSocket connection between Collate SaaS and the Hybrid Runner, and the flow from Argo Workflows through Ingestion Pods to GCP Secret Manager

Requirements

Hardware

A GKE cluster with at least 2 worker nodes. Each node requires:
  • 2 x86 vCPUs
  • 8 GiB Memory
  • 64 GiB Storage
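Recommended GCP instance type: t2a-standard-2 or t2d-standard-2. Note that t2a instances are Arm-based, so choose t2d-standard-2 if you need the x86 vCPUs listed above.
Ingestion workloads can run on Spot VMs to reduce costs, but at least one node must always remain running.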
To schedule Collate workloads on dedicated nodes, use Kubernetes taints and tolerations. The Hybrid Runner supports tolerations via custom Helm values.
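As a rough sketch of the taint side of this setup, the functions below mark a node as dedicated and print a matching toleration in standard Kubernetes syntax. The taint key `dedicated` and value `collate` are hypothetical choices. Where the toleration belongs in the Helm values depends on the chart, so check the chart's default values for the exact key.

```shell
# Sketch: reserve a node for Collate workloads with a taint.
# TAINT_KEY and TAINT_VALUE are hypothetical; pick your own.
TAINT_KEY="dedicated"
TAINT_VALUE="collate"

# Taint a node so only pods that tolerate the taint are scheduled on it.
taint_node() {
  kubectl taint nodes "$1" "${TAINT_KEY}=${TAINT_VALUE}:NoSchedule"
}

# Print the matching toleration in standard Kubernetes syntax.
print_toleration() {
  cat <<EOF
tolerations:
  - key: "${TAINT_KEY}"
    operator: "Equal"
    value: "${TAINT_VALUE}"
    effect: "NoSchedule"
EOF
}

# Usage:
#   taint_node <node-name>
#   print_toleration
```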

Software

  • Kubernetes 1.28 or later
  • Helm CLI and kubectl installed locally
  • Argo Workflows 3.7 (installed automatically by the Helm chart)

Installation

The Helm chart bundles all required Kubernetes resources, including:
  • The Hybrid Runner Server pod
  • Argo Workflows (installed as a dependency)
  • A CronJob that periodically renews the credentials used to pull images from Collate’s ECR registry
Helm chart repository: hybrid-ingestion-runner-helm-chart

Step 1 — Prerequisites

Before deploying, ensure:
  • Collate has provided you with AWS ECR credentials to pull the required Docker images.
  • You have a GKE cluster running Kubernetes 1.28+.
  • helm and kubectl are installed and configured to target your cluster.
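The checks above could be scripted roughly as follows. This is a minimal sketch: it assumes the kubelet version reported by the first node is representative of the cluster, and the function names are illustrative only.

```shell
# Sketch: check the Step 1 prerequisites from a terminal.

# Succeeds when a version string such as "v1.28.3-gke.1203001" is 1.28 or later.
is_supported_version() {
  local v="${1#v}"                 # drop the leading "v"
  local major="${v%%.*}"           # text before the first dot
  local rest="${v#*.}"             # text after the first dot
  local minor="${rest%%[!0-9]*}"   # leading digits of the minor version
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 28 ]; }
}

check_prereqs() {
  local tool node_version
  for tool in helm kubectl; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; return 1; }
  done
  # Kubelet version reported by the first node in the cluster.
  node_version="$(kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}')"
  if is_supported_version "$node_version"; then
    echo "ok: cluster runs $node_version"
  else
    echo "unsupported Kubernetes version: $node_version" >&2
    return 1
  fi
}

# Usage: check_prereqs
```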

Step 2 — Get the Collate Authentication Token

The Hybrid Runner authenticates to the Collate API using a JWT token from the ingestion bot.
  1. Log in to your Collate instance as an administrator.
  2. Navigate to Settings → Bots and search for ingestion.
  3. Open the IngestionBot entry and copy the OpenMetadata JWT Token.
You can also navigate directly to https://<your-instance>.getcollate.io/bots/ingestion-bot.
This token has no expiration date. Store it securely and treat it as a secret.

Step 3 — Create the Helm values file

Create a values.yaml file with the minimal required configuration:
config:
  agentId: "gke-prod"           # A descriptive name shown in the Collate UI
  authToken: <From step 2>
  serverHost: <mycluster>.getcollate.io

ecrRegistryHelper:
  collateCredentials:
    values:
      accessKeyId: <Provided by Collate>
      secretAccessKey: <Provided by Collate>

installArgoWorkflows: true
If you are unsure about any values, reach out to your Collate support contact.
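One way to avoid typing secrets into the file by hand is to render it from environment variables. The sketch below writes the same keys shown above; the `COLLATE_*` variable names and the `write_values` function are hypothetical, not part of the chart.

```shell
# Sketch: render values.yaml from environment variables.
# The COLLATE_* variable names are hypothetical.
write_values() {
  local out="${1:-values.yaml}" name val
  for name in COLLATE_AGENT_ID COLLATE_AUTH_TOKEN COLLATE_SERVER_HOST \
              COLLATE_ECR_ACCESS_KEY_ID COLLATE_ECR_SECRET_ACCESS_KEY; do
    eval "val=\${$name:-}"
    [ -n "$val" ] || { echo "missing: $name" >&2; return 1; }
  done
  (
    umask 077   # a newly created file is readable by the owner only
    cat > "$out" <<EOF
config:
  agentId: "${COLLATE_AGENT_ID}"
  authToken: "${COLLATE_AUTH_TOKEN}"
  serverHost: "${COLLATE_SERVER_HOST}"

ecrRegistryHelper:
  collateCredentials:
    values:
      accessKeyId: "${COLLATE_ECR_ACCESS_KEY_ID}"
      secretAccessKey: "${COLLATE_ECR_SECRET_ACCESS_KEY}"

installArgoWorkflows: true
EOF
  )
  chmod 600 "$out"   # also tighten a file that already existed
}

# Usage: write_values values.yaml
```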

Step 4 — Deploy

Add the Helm repository and install the chart:
helm repo add collate-hybrid https://open-metadata.github.io/hybrid-ingestion-runner-helm-chart
helm repo update
kubectl create namespace argo-workflows
helm upgrade --install collate-prod collate-hybrid/hybrid-ingestion-runner --values values.yaml
If you ran the previous command more than once, the Helm hooks may not have run, leaving the registry helper unable to generate the credentials. In that case, the Hybrid Runner pod fails with ImagePullBackOff. To trigger the credentials helper manually, run kubectl create job --from=cronjob/ecr-registry-helper manual.
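A slightly more defensive version of the manual trigger might look like the sketch below. It assumes the CronJob lives in your current namespace, and it adds a timestamp suffix to the job name (an illustrative choice) so that repeated runs do not collide with an earlier job.

```shell
# Sketch: re-run the registry helper and inspect the result.
rerun_registry_helper() {
  # A timestamp suffix avoids a name clash with an earlier manual job.
  local job="manual-$(date +%s)"
  kubectl create job --from=cronjob/ecr-registry-helper "$job" || return 1
  # Block until the job finishes, or give up after two minutes.
  kubectl wait --for=condition=complete "job/$job" --timeout=120s || return 1
  kubectl logs "job/$job"
  # Check whether the runner pod has left ImagePullBackOff.
  kubectl get pods
}

# Usage: rerun_registry_helper
```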

Step 5 — Validating the Setup

Once deployed, verify the runner is connected from the Collate UI:
  1. Go to Settings → Preferences → Ingestion Runners.
  2. Find your runner by the agentId you configured.
  3. Confirm the status shows Active.

Additional Settings

Creating a Service with the Hybrid Runner

  1. Navigate to Settings → Services and click Add New Service.
  2. In the Ingestion Runner dropdown, select your Hybrid Runner.
  3. Before filling in connection details, store any secrets in GCP Secret Manager. By default, the Runner has access to the prefix /collate/hybrid-ingestion-runner.

Secrets management

In the GCP Console, go to Secret Manager → Create Secret. Store the raw secret value (no quotes, no JSON wrapping).
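The same step can be done from the command line with gcloud. The sketch below reads the value from stdin and strips trailing newlines so that only the raw value is stored. The `<secret-id>` argument is a placeholder: how the path-style reference used in the Collate UI maps to a Secret Manager secret ID is not covered here, so confirm the naming with your Collate support contact.

```shell
# Sketch: create a secret that holds the raw value read from stdin.
create_raw_secret() {
  local secret_id="$1" value
  value="$(cat)"   # command substitution drops trailing newlines
  [ -n "$value" ] || { echo "empty secret value" >&2; return 1; }
  # --data-file=- tells gcloud to read the secret payload from stdin.
  printf '%s' "$value" | gcloud secrets create "$secret_id" \
    --replication-policy=automatic --data-file=-
}

# Usage (the value never appears on the command line):
#   create_raw_secret <secret-id> < password.txt
```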

Configuring GCP Secret Manager

Use Workload Identity to grant the ingestion pods access to GCP Secret Manager without managing service account keys.
Required IAM role: roles/secretmanager.secretAccessor
Update your Helm values to configure the secrets manager and annotate the pod service account:
config:
  secretsManager: "gcp"
  ingestionPods:
    serviceAccount:
      annotations:
        iam.gke.io/gcp-service-account: <gcp-service-account>@<project-id>.iam.gserviceaccount.com
Replace <gcp-service-account> and <project-id> with your GCP values.
You can restrict Secret Manager access to a specific path prefix, for example: projects/<project-id>/secrets/collate/*.
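On the GCP side, Workload Identity also needs IAM bindings for the annotated service account. A minimal sketch with gcloud follows. The Kubernetes namespace and service account name passed in are assumptions, so use the ones your deployment actually creates. The second command grants the role project-wide; narrowing it to specific secrets is left out of this sketch.

```shell
# Sketch: link the Kubernetes service account used by the ingestion pods to a
# GCP service account, then grant that account read access to secrets.
bind_workload_identity() {
  local project="$1" gsa="$2" namespace="$3" ksa="$4"
  local gsa_email="${gsa}@${project}.iam.gserviceaccount.com"

  # Let the Kubernetes service account act as the GCP service account.
  gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:${project}.svc.id.goog[${namespace}/${ksa}]" || return 1

  # Grant the role required to read secret values.
  gcloud projects add-iam-policy-binding "$project" \
    --member "serviceAccount:${gsa_email}" \
    --role roles/secretmanager.secretAccessor
}

# Usage:
#   bind_workload_identity <project-id> <gcp-service-account> <namespace> <k8s-service-account>
```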

Referencing the secret in the Collate UI

In any password or secret field in the connection form, enter the secret reference using this format:
secret:/collate/hybrid-ingestion-runner/<secret-name>
For example, if your secret is stored at /collate/hybrid-ingestion-runner/my-db-password, enter:
secret:/collate/hybrid-ingestion-runner/my-db-password
Secret references only work for fields that are treated as secrets in the connection form — these are the masked fields with a show/hide toggle icon.
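If you generate connection settings from scripts, a small helper can build the reference string consistently. This is only a sketch: the `secret_ref` function is hypothetical and assumes the default prefix described above.

```shell
# Sketch: build the reference string for a secret under the default prefix.
SECRET_PREFIX="/collate/hybrid-ingestion-runner"

secret_ref() {
  [ -n "${1:-}" ] || { echo "usage: secret_ref <secret-name>" >&2; return 1; }
  printf 'secret:%s/%s\n' "$SECRET_PREFIX" "$1"
}

# Usage:
#   secret_ref my-db-password
#   prints secret:/collate/hybrid-ingestion-runner/my-db-password
```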