
GKE Architecture Diagram

GKE Hybrid Runner architecture diagram showing the WebSocket connection
between Collate SaaS and the Hybrid Runner, and the flow from Argo Workflows
through Ingestion Pods to GCP Secret Manager

Prerequisites

Before you deploy on GKE, confirm the following:
  • ECR credentials: You’ve received AWS ECR (Elastic Container Registry) credentials (accessKeyId and secretAccessKey) from Collate to pull the required Docker images.
  • Kubernetes cluster: You have a GKE (Google Kubernetes Engine) cluster running Kubernetes 1.28+.
  • Tooling: helm and kubectl are installed and configured to target your cluster.
  • Secrets store: A secrets store is configured for your cluster. Store your connector credentials in the secrets store — they can’t be entered as plain text in the Collate UI. By default, Kubernetes Secrets are used and the Helm chart installs all required RBAC (Role-Based Access Control) automatically. To use GCP Secret Manager, see Secrets Management.
  • Authentication token: You have a JWT (JSON Web Token) from the IngestionBot. For more information about how to get your Collate JWT, see Obtain Collate JSON Web Token (JWT).
    Note: This token doesn’t expire. Store it securely and treat it as a secret.
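The tooling and version prerequisites above can be checked from a shell before you install anything. The following is a minimal sketch, not part of the Collate chart: the function names are hypothetical, and the version string is assumed to look like the gitVersion that kubectl reports (for example, v1.29.3-gke.100). The usage comment also assumes jq is installed.

```shell
# Hypothetical pre-flight helpers; names are illustrative, not part of the chart.

# Succeeds when a Kubernetes version string such as "v1.29.3-gke.100" is >= 1.28.
k8s_version_ok() {
  local v="${1#v}"                # drop the leading "v"
  local major="${v%%.*}"          # text before the first dot
  local rest="${v#*.}"            # text after the first dot
  local minor="${rest%%[!0-9]*}"  # leading digits of the minor version
  [ -n "$major" ] && [ -n "$minor" ] || return 1
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 28 ]; }
}

# Prints the name of every required CLI that is missing from PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage (run manually against your cluster):
#   missing_tools helm kubectl
#   k8s_version_ok "$(kubectl version -o json | jq -r .serverVersion.gitVersion)"
```

An empty result from missing_tools and a zero exit status from k8s_version_ok indicate that the tooling and version prerequisites are met.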

Installation Procedure

The Helm chart bundles all required Kubernetes resources:
  • The Hybrid Runner Server pod
  • Argo Workflows (installed as a dependency)
  • A cron job to periodically renew credentials from Collate’s ECR registry
Helm chart repository: hybrid-ingestion-runner-helm-chart

Step 1: Add the Helm Repository

Add the Collate Helm repository and fetch the latest chart index:
helm repo add collate-hybrid https://open-metadata.github.io/hybrid-ingestion-runner-helm-chart
helm repo update

Step 2: Create the Helm Values File

Create a values.yaml file with the minimum required configuration. Replace the placeholder values with the credentials provided by Collate and your instance details:
config:
  agentId: "gke-prod"           # Descriptive name shown in the Collate UI
  authToken: <Collate IngestionBot JWT token>
  serverHost: <your-instance>.getcollate.io

ecrRegistryHelper:
  collateCredentials:
    values:
      accessKeyId: <Provided by Collate>
      secretAccessKey: <Provided by Collate>

installArgoWorkflows: true
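
Placeholders left in this file are an easy mistake to make. As a small sketch (the function name is hypothetical), the check below fails if any <...> placeholder from the template is still present in the values file.

```shell
# Hypothetical helper: fails if the values file still contains <...> placeholders.
check_placeholders() {
  local file="$1"
  [ -r "$file" ] || { echo "Cannot read $file" >&2; return 2; }
  # grep prints each offending line with its line number.
  if grep -nE '<[^<>]+>' "$file"; then
    echo "Replace the placeholders listed above in $file" >&2
    return 1
  fi
}

# Usage: check_placeholders values.yaml && echo "values.yaml looks complete"
```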

Step 3: Create the Argo Workflows Namespace

Create a dedicated namespace for Argo Workflows before running the Helm install:
kubectl create namespace argo-workflows

Step 4: Deploy the Helm Chart

Install the Hybrid Runner chart using the values file you created in Step 2:
helm upgrade --install collate-prod collate-hybrid/hybrid-ingestion-runner --namespace argo-workflows --values values.yaml
Note: If you run this command more than once, Helm hooks may not re-execute and the ECR registry helper may fail to generate credentials. If the Runner pod shows ImagePullBackOff, trigger the credentials helper manually (add -n argo-workflows if your kubectl context doesn't already target the release namespace used in this step):
kubectl create job --from=cronjob/ecr-registry-helper manual
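Kubernetes Job names must be unique within a namespace, so a Job called manual can't be created a second time while the first one still exists. One way around this, sketched below with hypothetical function names, is to generate a timestamped Job name and wait for the Job to complete. The default namespace is an assumption taken from Step 4.

```shell
# Hypothetical helper: builds a unique Job name so the command can be re-run.
manual_job_name() {
  echo "ecr-manual-$(date +%Y%m%d%H%M%S)"
}

# Creates a Job from the chart's CronJob and waits for it to finish.
# NAMESPACE is an assumption; Step 4 installs the release into argo-workflows.
run_ecr_helper() {
  local ns="${NAMESPACE:-argo-workflows}"
  local name
  name="$(manual_job_name)"
  kubectl create job --from=cronjob/ecr-registry-helper "$name" -n "$ns" &&
    kubectl wait --for=condition=complete "job/$name" -n "$ns" --timeout=120s
}

# Usage: run_ecr_helper
```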

Verify the Deployment

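Once the Helm chart is deployed, run the following checks to confirm everything is working correctly. Commands shown without a -n flag use your current kubectl namespace; if that isn't the namespace of the Helm release (argo-workflows in Step 4), add -n argo-workflows.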
  • Check Pod Running Status: Confirm all pods are running before proceeding:
    kubectl get pods
    
  • Confirm Runner Shows Active in the Collate UI:
    1. Navigate to Settings > Preferences > Ingestion Runners.
    2. Find your runner by its agentId (for example, gke-prod).
    3. Confirm the status shows Active.
  • Verify Argo Workflows Pods Are Running: Confirm the Argo Workflows controller and server pods are healthy:
    kubectl get pods -n argo-workflows -l app.kubernetes.io/part-of=argo-workflows
    
  • Confirm ECR Cron Job Health: Confirm the ECR credentials cron job is present and scheduled:
    kubectl get cronjob ecr-registry-helper
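
The pod checks above can also be scripted. The sketch below is a hypothetical helper, not part of the chart: it reads the output of kubectl get pods --no-headers and prints only the pods that need attention, assuming the default column order (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin and
# prints the name and status of pods that are neither Running nor Completed.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3 }'
}

# Usage: kubectl get pods -n argo-workflows --no-headers | unhealthy_pods
```

No output means every pod is either Running or Completed.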
    

Creating a Service With the Hybrid Runner

Use this procedure to add a new service and connect it to your Hybrid Runner.
  1. Navigate to Settings > Services and select the service type you want to add.
  2. Select the service connector, then select Next.
  3. Enter the Service Name and, optionally, a Description, then select Next.
  4. In the Ingestion Runner drop-down, select your Hybrid Runner.
  5. Fill in the connection details. For any credential field that shows a hide/show toggle, enter the secret reference path from your secrets store instead of the actual value:
      secret:<secret-name>
    
    Important: Your credentials must be stored in your secrets store before referencing them here. See Secrets Management.

Secrets Management

The Hybrid Runner supports two approaches for managing connector credentials. Kubernetes Secrets is the default and requires no additional configuration.

Option 1: Kubernetes Secrets (Default)

The Helm chart uses native Kubernetes Secrets and automatically installs the required RBAC (Role-Based Access Control) permissions on your cluster. This works for most customers without any additional setup.
Storing a Secret: Create a Kubernetes Secret in the same namespace as your Helm release (argo-workflows if you followed Step 4; without --namespace, Helm uses your current kubectl context namespace, typically default):
kubectl create secret generic my-db-password \
  --from-literal=value='your-secret-value'
If you deployed the Helm chart to a custom namespace, add -n <your-namespace> to the command.
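To confirm the Secret holds the value you expect, you can read it back. Kubernetes returns Secret data base64-encoded, so the value has to be decoded. This is a minimal sketch with a hypothetical function name; the key value matches the --from-literal=value=... argument in the command above.

```shell
# Hypothetical helper: decodes a base64-encoded value read from stdin.
decode_secret_value() {
  base64 --decode
}

# Usage:
#   kubectl get secret my-db-password -o jsonpath='{.data.value}' | decode_secret_value
```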

Option 2: GCP Secret Manager (Optional)

Use Workload Identity to grant the ingestion pods access to GCP Secret Manager without managing service account keys.
Required IAM role: roles/secretmanager.secretAccessor
Update your Helm values to configure the secrets manager and annotate the pod service account:
config:
  secretsManager: "gcp"
  ingestionPods:
    serviceAccount:
      annotations:
        iam.gke.io/gcp-service-account: <gcp-service-account>@<project-id>.iam.gserviceaccount.com
Replace <gcp-service-account> and <project-id> with your GCP values.
Note: The annotation is only required for Workload Identity. Skip it if your cluster uses a different IAM binding approach.
Storing a Secret: In the GCP Console, navigate to Secret Manager > Create Secret. Set the secret name and paste the value as-is.
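If you prefer the command line to the GCP Console, the same setup can be sketched with gcloud. Everything below is an illustration under assumptions: the function names are hypothetical, all arguments are placeholders you supply, and the Kubernetes service account name used by the ingestion pods isn't stated in this guide, so look it up in your cluster. Piping the value with printf avoids adding a trailing newline, which keeps the value as-is.

```shell
# Hypothetical helper: builds the Workload Identity member string for a
# Kubernetes service account (KSA) in the given project and namespace.
wi_member() {
  local project="$1" namespace="$2" ksa="$3"
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$project" "$namespace" "$ksa"
}

# Sketch of the gcloud calls; reads the secret value from stdin.
setup_gcp_secret() {
  local project="$1" gsa_email="$2" namespace="$3" ksa="$4" secret="$5"
  # Create the secret from stdin (this consumes the piped value).
  gcloud secrets create "$secret" --project "$project" \
    --replication-policy=automatic --data-file=-
  # Grant the GCP service account the role named above on this secret.
  gcloud secrets add-iam-policy-binding "$secret" --project "$project" \
    --member "serviceAccount:$gsa_email" \
    --role roles/secretmanager.secretAccessor
  # Let the KSA act as the GCP service account through Workload Identity.
  gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
    --project "$project" \
    --role roles/iam.workloadIdentityUser \
    --member "$(wi_member "$project" "$namespace" "$ksa")"
}

# Usage (all values are placeholders):
#   printf '%s' 'your-secret-value' | setup_gcp_secret my-project \
#     my-sa@my-project.iam.gserviceaccount.com argo-workflows my-ksa my-db-password
```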

Referencing a Secret in the Collate UI

For both options, enter the secret reference path in any masked credential field (fields with a hide/show toggle) in the connection form:
secret:<secret-name>
For example:
secret:my-db-password
Secret references only work for masked fields—those with a hide/show toggle icon.
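
When you use Kubernetes Secrets, the secret name in the reference must be a valid Kubernetes object name. The sketch below, with a hypothetical function name, builds the reference string and rejects names that Kubernetes would not accept as a Secret name. It doesn't apply GCP Secret Manager naming rules, which differ.

```shell
# Hypothetical helper: prints the reference for a Kubernetes Secret name, or
# fails if the name is not a valid DNS subdomain: lowercase alphanumerics, "-"
# and ".", with each dot-separated label starting and ending with an
# alphanumeric, and at most 253 characters in total.
secret_ref() {
  local name="$1"
  [ "${#name}" -le 253 ] || return 1
  printf '%s\n' "$name" |
    grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$' ||
    return 1
  printf 'secret:%s\n' "$name"
}

# Usage: secret_ref my-db-password
```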